Gatsby Computational Neuroscience Unit

UCL Gatsby Unit

Alexander Smola

(Australian National University)

Tuesday 6th December 2011

16.00

B10 Seminar Room,

Alexandra House, 17 Queen Square, London, WC1N 3AR

Scaling Machine Learning to the Internet

In this talk I will give an overview of an array of highly scalable techniques for both observed and latent variable models. Their scalability makes them well suited to problems such as classification, recommendation systems, topic modeling and user profiling. I will present algorithms for batch and online distributed convex optimization to deal with large amounts of data, and hashing to address the issue of parameter storage for personalization and collaborative filtering. Furthermore, for latent variable models I will discuss distributed sampling algorithms capable of handling tens of billions of latent variables on a cluster of 1000 machines.

The algorithms described are used for personalization, spam filtering, recommendation, document analysis, and advertising.
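
As a rough illustration of the hashing idea mentioned in the abstract, the sketch below maps an unbounded set of global and per-user features into a weight vector of fixed size, so parameter storage does not grow with the number of users. It is not the speaker's implementation: the function names, the use of MD5, the signed hash, and the `user_token` feature naming are all assumptions made for this example.

```python
import hashlib


def _hash(token: str, seed: str) -> int:
    """Deterministic 64-bit hash of a token (MD5 is an arbitrary choice here)."""
    digest = hashlib.md5((seed + ":" + token).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little")


def hashed_features(tokens, num_buckets, user=None):
    """Map tokens to a sparse vector {bucket index: value} of fixed dimension.

    If `user` is given, each token also contributes a personalized copy
    (hypothetical naming: "<user>_<token>"), so global and per-user
    parameters share the same fixed-size weight vector.
    """
    vec = {}
    for tok in tokens:
        keys = [tok]
        if user is not None:
            keys.append(f"{user}_{tok}")
        for key in keys:
            idx = _hash(key, "index") % num_buckets
            # A second hash picks a sign, so colliding features tend to
            # cancel rather than systematically add up.
            sign = 1.0 if _hash(key, "sign") % 2 == 0 else -1.0
            vec[idx] = vec.get(idx, 0.0) + sign
    return vec


def score(weights, features):
    """Linear score: inner product of dense weights and sparse features."""
    return sum(weights[i] * v for i, v in features.items())
```

For example, `hashed_features(["cheap", "pills"], 2**20, user="alice")` yields at most four non-zero entries in a vector of dimension 2**20, however many distinct users or tokens exist overall. Collisions between features are tolerated as a trade-off for the fixed memory footprint.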