The theory of deep neural networks
Deep neural networks have revolutionised machine learning.
But what makes deep networks so effective?
We develop a rigorous theory describing how the flexibility of finite (but not infinite) neural networks shapes representations to solve difficult tasks.
Adaptive stochastic gradient descent as Bayesian filtering
How should we train our neural network?
There is no easy answer: many algorithms have been suggested, and at present there is no simple way to choose between them.
Remarkably, we can show that three of the most important techniques (Adam, decoupled weight decay, and RAdam) arise from treating stochastic gradient descent as a Bayesian inference problem.
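For readers unfamiliar with these optimisers, the following is a minimal sketch of the standard Adam update with decoupled weight decay, written in plain Python. It shows only the usual textbook update rule, not the Bayesian filtering derivation, and it omits RAdam's variance rectification. The function names, the toy quadratic objective, and all hyperparameter values are illustrative assumptions.

```python
import math


def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One Adam step with decoupled weight decay on lists of floats.

    t is the 1-based step count, used for bias correction.
    """
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        # Exponential moving averages of the gradient and squared gradient.
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias correction for the zero initialisation of m and v.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Decoupled weight decay: shrink the parameter directly,
        # rather than adding weight_decay * p to the gradient.
        p = p - lr * weight_decay * p
        # Adam step: gradient rescaled by its root-mean-square estimate.
        p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
        new_theta.append(p)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


def quadratic_loss(theta):
    """Toy objective (an assumption for illustration): 0.5 * ||theta||^2."""
    return 0.5 * sum(p * p for p in theta)


def minimise_quadratic(theta, steps=200, lr=0.1):
    """Run adamw_step on the toy objective, whose gradient is theta itself."""
    m = [0.0] * len(theta)
    v = [0.0] * len(theta)
    for t in range(1, steps + 1):
        grad = list(theta)
        theta, m, v = adamw_step(theta, grad, m, v, t, lr=lr)
    return theta


if __name__ == "__main__":
    start = [5.0, -3.0]
    end = minimise_quadratic(start)
    print(quadratic_loss(start), quadratic_loss(end))
```

The sketch keeps the decay separate from the adaptive step, so the decay is not rescaled by the squared-gradient estimate; that separation is what distinguishes decoupled weight decay from an L2 penalty added to the loss.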
How can we perform accurate inference in large-scale models with rich statistical structure?
Here, we apply insights from classical approaches such as particle filtering and message passing to obtain exponentially many importance samples in state-of-the-art deep variational autoencoders.
What do our neural networks know? And more importantly, what don't they know?
Here, we draw on ideas from areas ranging from neuroscience to the theory of deep networks to reason accurately about uncertainty in neural network parameters.
Flow-based models with structured priors
How can deep models learn about the structure of the world without explicit supervision?
Here, we impose high-level, interpretable structure on the neural representations induced by state-of-the-art flow-based models of natural stimuli.
Variability and uncertainty in neural systems
How can the brain compute efficiently under energetic constraints? And how can the brain represent uncertainty about the world?
Remarkably, we have been able to show that the solution to these problems is one and the same: efficient computation automatically reasons about uncertainty, and reasoning about uncertainty allows the brain to compute efficiently.