Previous: 24th October 2006 — Collaborative filtering?
Next: 21st November 2006 — *IC

7th November 2006 — Space-partitioning data structures

Ed and Iain will work through some examples of speeding up standard machine learning algorithms through space-partitioning data structures. In particular we’ll discuss speeding up nearest neighbour search, k-means, and the extent to which regression problems can be improved.

Basically we will touch on some of the ideas in this tutorial: Data structures for fast statistics, Alexander Gray and Andrew Moore, School of Computer Science, Carnegie Mellon University, 2004, a tutorial presented at ICML.

An example paper to read is: Efficient locally weighted polynomial regression predictions, Andrew Moore, Jeff Schneider and Kan Deng, Fourteenth International Conference on Machine Learning, D. Fisher (editor), 236–244, 1997.

Also relevant:

The anchors hierarchy: using the triangle inequality to survive high-dimensional data, Andrew Moore, Twelfth Conference on Uncertainty in Artificial Intelligence, 397–405, 2000.
Rapid evaluation of multiple density models, Alexander G. Gray and Andrew W. Moore, Artificial Intelligence and Statistics, 2003.
Accelerating exact k-means algorithms with geometric reasoning, Dan Pelleg and Andrew Moore, Fifth International Conference on Knowledge Discovery in Databases, Surajit Chaudhuri and David Madigan (editors), 277–281, AAAI Press, 1999.

…and many others. Note we will not include more recent work using analytic expansions of kernels.

Iain Murray.