The Helmholtz Machine
Peter Dayan   Geoff Hinton   Radford Neal
  Rich Zemel
Neural Computation, 7, 889-904.
Abstract
Discovering the structure inherent in a set of patterns is a
fundamental aim of statistical inference or learning. One fruitful
approach is to build a parameterised stochastic generative model,
independent draws from which are likely to produce the patterns. For
all but the simplest generative models, each pattern can be generated
in exponentially many ways. It is thus intractable to adjust the
parameters to maximize the probability of the observed patterns, We
describe a way of finessing this combinatorial explosion by maximising
an easily computed lower bound on the probability of the observations.
Our method can be viewed as a form of hierarchical self-supervised
learning that may relate to the function of bottom-up and top-down
cortical processing pathways.
compressed postscript   pdf