Fit Like You Sample: Sample-Efficient Generalized Score Matching from Fast Mixing Diffusions

Qin, Yilong; Risteski, Andrej

Abstract:Score matching is an approach to learning probability distributions parametrized up to a constant of proportionality (e.g. Energy-Based Models). The idea is to fit the score of the distribution, rather than the likelihood, thus avoiding the need to evaluate the constant of proportionality. While there's a clear algorithmic benefit, the statistical "cost'' can be steep: recent work by Koehler et al. 2022 showed that for distributions that have poor isoperimetric properties (a large Poincaré or log-Sobolev constant), score matching is substantially statistically less efficient than maximum likelihood. However, many natural realistic distributions, e.g. multimodal distributions as simple as a mixture of two Gaussians in one dimension -- have a poor Poincaré constant.
In this paper, we show a close connection between the mixing time of a broad class of Markov processes with generator $\mathcal{L}$ and an appropriately chosen generalized score matching loss that tries to fit $\frac{\mathcal{O} p}{p}$. This allows us to adapt techniques to speed up Markov chains to construct better score-matching losses. In particular, ``preconditioning'' the diffusion can be translated to an appropriate ``preconditioning'' of the score loss. Lifting the chain by adding a temperature like in simulated tempering can be shown to result in a Gaussian-convolution annealed score matching loss, similar to Song and Ermon, 2019. Moreover, we show that if the distribution being learned is a finite mixture of Gaussians in $d$ dimensions with a shared covariance, the sample complexity of annealed score matching is polynomial in the ambient dimension, the diameter of the means, and the smallest and largest eigenvalues of the covariance -- obviating the Poincaré constant-based lower bounds of the basic score matching loss shown in Koehler et al. 2022.

Comments:	41 pages
Subjects:	Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS)
Cite as:	arXiv:2306.09332 [cs.LG]
	(or arXiv:2306.09332v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.09332

Computer Science > Machine Learning

Title:Fit Like You Sample: Sample-Efficient Generalized Score Matching from Fast Mixing Diffusions

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators