Approximate Contrastive Free Energies for Learning in Undirected Graphical Models
Max Welling and Geoffrey E. Hinton
Gatsby Computational Neuroscience Unit
University College London
17 Queen Square
London WC1N 3AR, UK
GCNU TR 2001-004
Abstract
We present a novel class of learning algorithms for undirected graphical models, based on the contrastive free energy (CF). In particular, we study the naive mean field, TAP and Bethe approximations to the contrastive free energy. The main advantage of CF learning is that it eliminates the need to infer equilibrium statistics, for which mean-field-type approximations are particularly unsuitable. Instead, learning decreases the distance between the data distribution and the distribution obtained by clamping one-step reconstructions on the visible nodes. We test the learning algorithm on the classification of digits.
Introduction
When learning undirected graphical models from data, we change the parameters so that the model distribution matches the data distribution. To compute the statistics of the model distribution we need to perform inference in a network with no evidence clamped on any of its nodes. However, for a large class of models inference is intractable and approximate methods must be employed. A wide variety of approximate inference methods are now available, such as variational approximations, Markov chain Monte Carlo (MCMC) sampling and, more recently, loopy belief propagation. Unfortunately, these methods typically fail when no external evidence is present (and the correlations are not weak), since the distribution is then likely to be highly multimodal. In this regime variational approximations fail to capture the complicated dependencies between the random variables, MCMC methods suffer from extremely slow equilibration, and belief propagation does not converge or gives poor results. We therefore argue that instead of trying to (marginally) improve our inference methods, it may be more fruitful to look for alternative learning objectives that avoid the need to compute equilibrium statistics. As one such learning objective we advocate the contrastive free energy (CF), introduced by (Hinton 2000) in the context of "restricted Boltzmann machines". In this paper we extend the use of CF to general undirected graphical models, in the context of deterministic approximations such as naive mean field (MF), TAP and the Bethe approximation.
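To make the idea of contrasting data statistics with one-step reconstruction statistics concrete, the following is a minimal sketch of one-step contrastive learning in a restricted Boltzmann machine, the setting in which CF learning was introduced. All sizes, the learning rate, and the omission of bias terms are illustrative choices, not part of the original formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative sizes: 6 visible and 4 hidden binary units.
n_vis, n_hid = 6, 4
W = 0.01 * rng.standard_normal((n_vis, n_hid))  # weights (biases omitted for brevity)

def cd1_update(v0, W, lr=0.1):
    """One contrastive step: compare data statistics with the
    statistics of a one-step reconstruction of the data."""
    # Positive phase: hidden probabilities given the data.
    h0 = sigmoid(v0 @ W)
    # Sample the hidden units, then reconstruct the visible units once.
    h_sample = (rng.random(h0.shape) < h0).astype(float)
    v1 = sigmoid(h_sample @ W.T)
    # Negative phase: hidden probabilities given the reconstruction.
    h1 = sigmoid(v1 @ W)
    # Contrastive update: <v h>_data - <v h>_reconstruction,
    # averaged over the batch.
    return W + lr * (v0.T @ h0 - v1.T @ h1) / v0.shape[0]

# A tiny batch of binary "data" vectors.
v0 = (rng.random((8, n_vis)) < 0.5).astype(float)
W = cd1_update(v0, W)
```

Note that only a single reconstruction step is taken: no statistic of the equilibrium (fully unclamped) distribution is ever computed, which is exactly what the contrastive objective is designed to avoid.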