Approximate Contrastive Free Energies for Learning in Undirected Graphical Models
Max Welling and Geoffrey E. Hinton
Gatsby Computational Neuroscience Unit
University College London
17 Queen Square
London WC1N 3AR, UK
GCNU TR 2001-004
Abstract
We present a novel class of learning algorithms for undirected graphical models, based on the contrastive free energy (CF). In particular, we study the naive mean field, TAP and Bethe approximations to the contrastive free energy. The main advantage of CF learning is that it eliminates the need to infer equilibrium statistics, for which mean-field-type approximations are particularly unsuitable. Instead, learning decreases the distance between the data distribution and the distribution obtained by clamping one-step reconstructions on the visible nodes. We test the learning algorithm on the classification of digits.
Introduction
When learning undirected graphical models from data, we change the parameters so that the model distribution matches the data distribution. To compute the statistics of the model distribution we need to perform inference in a network with no evidence clamped on any of its nodes. However, for a large class of models inference is intractable and approximate methods must be employed. A wide variety of approximate inference methods are now available, such as variational approximations, Markov chain Monte Carlo (MCMC) sampling and, more recently, loopy belief propagation. Unfortunately, these methods typically fail when no external evidence is present (and the correlations are not weak), since the distribution is then likely to be highly multimodal. In this regime variational approximations fail to capture the complicated dependencies between the random variables, MCMC methods suffer from extremely slow equilibration, and belief propagation does not converge or gives poor results. We therefore argue that instead of trying to (marginally) improve our inference methods, it may be more fruitful to look for alternative learning objectives that avoid the need to compute equilibrium statistics. As one such learning objective we advocate the contrastive free energy (CF), introduced by (Hinton 2000) in the context of "restricted Boltzmann machines". In this paper we extend the use of CF to general undirected graphical models, in the context of deterministic approximations such as naive mean field (MF), TAP and the Bethe approximation.
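To make the idea of contrasting data statistics with one-step reconstruction statistics concrete, the following is a minimal sketch of one-step contrastive learning in a restricted Boltzmann machine, the setting in which CF learning was introduced. All sizes, the learning rate, and the omission of bias terms are illustrative choices, not part of the original formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative sizes: 6 visible and 4 hidden binary units.
n_vis, n_hid = 6, 4
W = 0.01 * rng.standard_normal((n_vis, n_hid))  # weights (biases omitted for brevity)

def cd1_update(v0, W, lr=0.1):
    """One contrastive step: compare data statistics with the
    statistics of a one-step reconstruction of the data."""
    # Positive phase: hidden probabilities given the data.
    h0 = sigmoid(v0 @ W)
    # Sample the hidden units, then reconstruct the visible units once.
    h_sample = (rng.random(h0.shape) < h0).astype(float)
    v1 = sigmoid(h_sample @ W.T)
    # Negative phase: hidden probabilities given the reconstruction.
    h1 = sigmoid(v1 @ W)
    # Contrastive update: <v h>_data - <v h>_reconstruction,
    # averaged over the batch.
    return W + lr * (v0.T @ h0 - v1.T @ h1) / v0.shape[0]

# A tiny batch of binary "data" vectors.
v0 = (rng.random((8, n_vis)) < 0.5).astype(float)
W = cd1_update(v0, W)
```

Note that only a single reconstruction step is taken: no statistic of the equilibrium (fully unclamped) distribution is ever computed, which is exactly what the contrastive objective is designed to avoid.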