Name
Holden Lee
Date & Time
Thursday, October 17, 2024, 4:30 PM - 5:00 PM
Holden Lee
Description

Speaker 14

Title
Efficiently Learning And Sampling Multimodal Distributions With Data-based Initialization
Abstract
We consider the problem of sampling a multimodal distribution with a Markov chain given a small number of samples from it. Although mixing can be arbitrarily slow, we show that if the Markov chain has a $k$th order spectral gap, initialization from a set of $\tilde O(k/\varepsilon^2)$ samples from the stationary distribution will, with high probability over the samples, efficiently give a sample that is $\varepsilon$-close in TV distance to the desired distribution. In particular, this applies to mixtures of $k$ distributions satisfying a Poincare inequality, and with faster convergence when they satisfy a log-Sobolev inequality. Our bounds are stable to perturbations to the Markov chain, and in particular work for Langevin diffusion over $\mathbb R^d$ with score estimation error, as well as Glauber dynamics with estimation from a pseudolikelihood estimator. This justifies the success of data-based initialization for score matching methods despite slow mixing for the data distribution. We improve and generalize the previous results to have linear rather than exponential dependence on $k$. Our results also show that low-complexity Ising measures can be efficiently learned from samples.