Full Name
Siddharth Mitra
Primary Department, Unit, or Institute
Computer Science
University/Company
Yale
Talk Title
Convergence of Φ-Divergence and Φ-Mutual Information Along Langevin Markov Chains
Abstract
The mixing time of a Markov chain determines when the marginal distribution of the chain is close to the stationary distribution, and it can be studied in many statistical divergences, from KL divergence and chi-squared divergence all the way to families of divergences such as Φ-divergences. However, the mixing time does not determine the dependence between samples along the Markov chain, which can be measured by their mutual information, chi-squared mutual information, or more generally their Φ-mutual information. In this talk we study the mixing time of a popular class of Markov chains in Φ-divergence and also study the dependence between the iterates in Φ-mutual information. The Markov chains we focus on are the Langevin Dynamics in continuous time, and the Unadjusted Langevin Algorithm and the Proximal Sampler in discrete time; we show that for these Markov chains, the Φ-divergence and the Φ-mutual information decrease exponentially fast. Our proof technique is based on showing that Strong Data Processing Inequalities (SDPIs) hold along these Markov chains. Time permitting, we discuss the Hamiltonian Monte Carlo algorithm and study its mixing time in KL divergence and the dependence between its iterates in classical mutual information; this analysis is based on showing a one-step regularity property of the algorithm that upgrades Wasserstein mixing guarantees to KL divergence.
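
For concreteness, a minimal sketch of the objects involved, in standard notation (the potential V with target π ∝ e^{-V}, step size η, and contraction factor α are illustrative choices, not fixed by the abstract): for a convex Φ with Φ(1) = 0,

\[
D_\Phi(\rho \,\|\, \pi) = \mathbb{E}_{\pi}\!\left[ \Phi\!\left( \tfrac{\rho}{\pi} \right) \right],
\qquad
I_\Phi(X; Y) = D_\Phi\!\left( P_{X,Y} \,\|\, P_X \otimes P_Y \right).
\]

The continuous-time Langevin Dynamics and its discretization, the Unadjusted Langevin Algorithm, are

\[
\mathrm{d}X_t = -\nabla V(X_t)\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}W_t,
\qquad
x_{k+1} = x_k - \eta \nabla V(x_k) + \sqrt{2\eta}\,\xi_k, \quad \xi_k \sim \mathcal{N}(0, I).
\]

A Strong Data Processing Inequality for a Markov kernel K asserts that
\[
D_\Phi(\mu K \,\|\, \nu K) \le \alpha\, D_\Phi(\mu \,\|\, \nu) \quad \text{for some } \alpha < 1,
\]
and iterating such inequalities along the chain is what yields the exponential decrease described above.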

Based on joint work with Andre Wibisono and Jiaming Liang: https://arxiv.org/abs/2410.10699 and https://arxiv.org/abs/2402.17067.