Arthur Gretton

Contact arthur.gretton@gmail.com
Gatsby Computational Neuroscience Unit
Sainsbury Wellcome Centre
25 Howland Street
London W1T 4JG UK

Recent courses

COMP 0083: Advanced Topics in Machine Learning (2024)
This course comprises 15 hours on kernel methods and 15 hours on convex optimization. The kernel methods part covers: construction of RKHS, in terms of feature spaces and smoothing properties; simple linear algorithms in RKHS (PCA, ridge regression); kernel methods for hypothesis testing (two-sample, goodness-of-fit, independence); support vector machines for classification; and further applications of kernels (feature selection, clustering, ICA). There is an additional component (not assessed) on the theory of reproducing kernel Hilbert spaces.
Course web page
Causal Effect Estimation with Context and Confounders (2024)
Course given at MLSS 2024 (OIST, Okinawa). Covers causal effect estimation with observed confounders (average treatment effect, conditional average treatment effect, average treatment effect on the treated, mediation effect) and with hidden confounders (instrumental variable regression, proximal causal learning), using kernel and adaptive neural net approaches.
Slides 1
Slides 2
Kernel methods for hypothesis testing, causality, and generative models (2023)
Course given at the Columbia Statistics Department. Covers two-sample testing with the MMD, goodness-of-fit testing with the kernel Stein discrepancy, causal effect estimation with context and confounders, Generalized Energy-Based Models, and MMD diffusion models.
Course web page
Kernel methods for comparing distributions and training generative models (2022)
Course given at the Online Asian Machine Learning School. Covers the MMD, two-sample testing, GAN training with MMD, and Generalized Energy-Based Model training using the KL Approximate Lower-Bound Estimator (KALE).
Slides 1
Slides 2
Probability Divergences, Generative Models, and Causality (2022)
Course given at the DeepLearn 2022 Summer School. First lecture: MMD and two-sample testing. Second lecture: GAN training with MMD, and Generalized Energy-Based Model training using the KL Approximate Lower-Bound Estimator (KALE). Third lecture: Causal modelling with kernels: treatment effects, counterfactuals, mediation, and proxies.
Slides 1
Slides 2
Slides 3
Introduction to kernel methods and applications in comparing distributions (2020)
Course given at the Machine Learning Summer School in Tuebingen (virtual). First lecture: introduction to reproducing kernel Hilbert spaces and their construction. Second lecture: the maximum mean discrepancy (MMD), two-sample testing with the MMD, and training generative adversarial networks using an MMD critic.
Slides 1 and video
Slides 2 and video
Lectures on distribution embeddings for GANs, model fitting, and dependence detection (Athens 2018, Paris 2019, Tokyo 2019)
A short course for the machine learning with kernels course at the University of Paris Saclay, also given at the Greek Stochastics Workshop, and partly covered at the workshop on functional inference and machine intelligence in Tokyo. The first half introduces the maximum mean discrepancy (MMD), how to maximise the test power of the MMD for benchmarking generative adversarial networks (GANs), and how to regularise the MMD when training GANs. The second half describes the kernel Stein discrepancy for goodness-of-fit testing, and the Hilbert-Schmidt Independence Criterion (HSIC) for dependence testing.
Slides 1
Slides 2
Machine Learning Summer School (Madrid 2018, Paris 2018)
A short course on kernels for the Machine Learning Summer School in Madrid and the Data Science Summer School in Paris, covering reproducing kernel Hilbert spaces, distribution embeddings for hypothesis testing and training generative adversarial networks, and feature-space covariance for dependence testing. The more detailed slides from the Madrid MLSS are provided.
Slides 1 and video
Slides 2 and video
Slides 3 and video
Accompanying practical session for the Paris summer school by Heiko Strathmann.
Machine Learning Summer School (Cadiz and Arequipa 2016, Tuebingen 2015)
A short course on kernels for the Machine Learning Summer Schools in Tuebingen, Cadiz, and Arequipa. The first lecture covers the fundamentals of reproducing kernel Hilbert spaces. The second lecture introduces distribution embeddings, characteristic kernels, hypothesis testing, and optimal kernel choice for testing. The third lecture covers advanced topics: three-variable interactions, covariance in feature spaces, kernels that induce energy distances, and Bayesian inference with kernels.
Introduction
First lecture and video
Second lecture and video
Third lecture and video
Note: videos are from the Tuebingen summer school, but some of the slides are from more recent summer schools and have updates.
The UAI 2017 tutorial slides (see talks) are more recent still, and are less technically detailed but more polished.
Short Course for the Workshop on Nonparametric Measures of Dependence, Columbia 2014
A short course on kernels for the Nonparametric Measures of Dependence workshop at Columbia. The course covers three nonparametric hypothesis testing problems: (1) given samples from distributions p and q, a homogeneity test determines whether to accept or reject p = q; (2) given a joint distribution p_xy over random variables x and y, an independence test investigates whether p_xy = p_x p_y; (3) given a joint distribution over several variables, we may test whether there exists a factorization (e.g., p_xyz = p_xy p_z, or for the case of total independence, p_xyz = p_x p_y p_z). The tests benefit from many years of machine learning research on kernels for various domains, and thus apply to distributions on high-dimensional vectors, images, strings, graphs, groups, and semigroups, among others. The energy distance and distance covariance statistics are also shown to fall within the RKHS family.
Course web page
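As an illustration of problem (1) in the description above, here is a minimal sketch of a homogeneity (two-sample) test. It is not taken from the course materials: the choice of the maximum mean discrepancy (MMD) as the statistic, the Gaussian kernel with a fixed bandwidth, the permutation null, and all function names below are assumptions made for this example.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Kernel matrix with entries exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(k, n):
    """Biased (V-statistic) estimate of MMD^2 from a pooled kernel matrix.

    The first n rows/columns of k are treated as the first sample,
    the remaining ones as the second sample.
    """
    return k[:n, :n].mean() + k[n:, n:].mean() - 2.0 * k[:n, n:].mean()


def mmd_permutation_test(x, y, bandwidth=1.0, num_permutations=500, seed=0):
    """Return (statistic, p_value) for the null hypothesis p = q.

    x and y are arrays of shape (n, d) and (m, d). The null distribution
    is approximated by randomly reassigning pooled points to the two samples.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n = len(x)
    k = gaussian_kernel(pooled, pooled, bandwidth)
    observed = mmd2_biased(k, n)

    exceed = 0
    for _ in range(num_permutations):
        idx = rng.permutation(len(pooled))
        # Permuting rows and columns together relabels the pooled points.
        if mmd2_biased(k[np.ix_(idx, idx)], n) >= observed:
            exceed += 1

    # The +1 terms count the observed statistic among the permutations,
    # which keeps the p-value strictly positive.
    p_value = (exceed + 1) / (num_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=(100, 2))  # samples from p
    y = rng.normal(1.0, 1.0, size=(100, 2))  # samples from q (shifted mean)
    stat, p_value = mmd_permutation_test(x, y)
    print(f"MMD^2 estimate: {stat:.4f}, permutation p-value: {p_value:.4f}")
```

A small p-value is evidence against p = q. The fixed bandwidth is a simplification; in practice the kernel choice strongly affects test power.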
Short Course for the Workshop on Kernel Methods for Big Data, Lille 2014
A short course on kernels for the Kernel Methods for Big Data workshop. The first lecture is an introduction to RKHS. The second covers embeddings of probabilities into RKHS and characteristic kernels. The third lecture covers advanced topics: the relation between RKHS embeddings of probabilities and energy distances, optimal kernel choice for two-sample testing, testing three-way interactions, and Bayesian inference without models.
Note that the Columbia course covers the topics of Lectures 1 and 2 in greater depth, but does not cover all the topics in Lecture 3.
First lecture
Second lecture
Third lecture
Introduction to Machine Learning, short course on kernel methods
This course comprises three hours of lectures and a three-hour practical session. Material includes construction of RKHS, in terms of feature spaces and smoothing properties; simple linear algorithms in RKHS (maximum mean discrepancy, ridge regression); and support vector machines for classification.
Course web page