Helen Wills Neuroscience Institute & School of Optometry, and Redwood
Center for Theoretical Neuroscience, UC Berkeley, USA
Friday 6 March 2009
Seminar Room B10 (Basement)
Alexandra House, 17 Queen Square, London, WC1N 3AR
Learning transformational invariants from natural movies
A key attribute of visual perception is the ability to extract invariances from visual input. Here we focus on transformational invariants, i.e., dynamical properties such as motion that are invariant to form or spatial structure. We show that a hierarchical, probabilistic model can learn to extract complex motion from movies of the natural environment. The model consists of two hidden layers: the first layer produces a sparse representation of the image, expressed in terms of local amplitude and phase variables; the second layer learns the higher-order structure among the time-varying phase variables. After training on natural movies, the top-layer units discover the structure of phase shifts within the first layer. These top-layer units encode transformational invariants: they are selective for the speed and direction of a moving pattern, but invariant to its spatial structure (orientation/spatial frequency). The diversity of units in both the intermediate and top layers of the model provides a set of testable predictions for representations that might be found in V1 and MT. In addition, the model demonstrates how feedback from higher levels can influence representations at lower levels as a by-product of inference in a graphical model.
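The amplitude/phase split described above can be illustrated with a minimal sketch (not the authors' code; all parameter values are illustrative). Assuming the first-layer coefficients are complex-valued, with amplitude encoding spatial structure and phase advancing as a pattern moves, the phase velocity is a motion signal that is invariant to form:

```python
import numpy as np

# Hypothetical sketch: a first-layer coefficient a = amplitude * exp(i*phase).
# For a pattern translating at constant speed, the amplitude (form) stays
# roughly fixed while the phase advances linearly in time, so the phase
# velocity carries the transformational invariant (speed/direction).

rng = np.random.default_rng(0)

T = 100                  # number of movie frames (illustrative)
amplitude = 1.5          # fixed spatial structure (form)
phase_velocity = 0.3     # radians per frame, set by speed and direction

phase = phase_velocity * np.arange(T) + rng.normal(0.0, 0.01, T)
a = amplitude * np.exp(1j * phase)      # first-layer complex coefficients

# Separate form from transformation:
recovered_amp = np.abs(a)                        # amplitude: constant over time
dphase = np.angle(a[1:] * np.conj(a[:-1]))       # wrapped per-frame phase shift

print(recovered_amp.mean())   # close to 1.5 (form)
print(dphase.mean())          # close to 0.3 (motion)
```

In the full model, a second layer would learn the regularities among many such phase-shift variables across space, yielding units tuned to speed and direction but not to orientation or spatial frequency.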
Joint work with Charles Cadieu.