This class provides an introduction to Machine Learning and its core algorithms.
Teaching Assistants
Lu Meng (lumeng@stat.columbia.edu)
Office hours: Tue 5:30-7:30pm, 1025 SSW (tenth floor, Department of Statistics)
Jingjing Zou (jingjing@stat.columbia.edu)
If you have questions about how your homework was graded, please address them to Jingjing.
Homework
Number 1 (due: 13 Feb)
Number 2 (due: 4 Mar)
  Additional files: Digit data and fakedata.R
Number 3 (due: 3 Apr)
Number 4 (due: 17 Apr)
  Additional files: histograms.zip
Number 5 (due: 1 May)
Textbooks
The course is not based on a specific textbook. The relevant course materials are the slides.

First half of the class
If you would like to complement the lectures and slides with further reading, the best reference for the first half of the class (roughly up to the midterm) is probably:
The Elements of Statistical Learning
T. Hastie, R. Tibshirani and J. Friedman.
Second Edition, Springer, 2009.
[Available online here]
Topic | Chapter
Linear classifiers, Perceptron | 4.1, 4.5
Maximum margin classifiers, SVMs | 12.1, 12.2
Kernels | 12.3
Model selection and cross validation | 7, in particular 7.10
Trees | 9.2
Boosting | 10.1, 10.8
Bagging | 8.7
Random Forests | 15
Linear regression | 3.2
Shrinkage | 3.4
Second half of the class
There is unfortunately no single book that covers all topics in the second half of the class well, but some useful sources are:
Pattern Recognition and Machine Learning.
Christopher M. Bishop.
Springer, 2006.

Machine Learning: A Probabilistic Perspective.
Kevin P. Murphy.
MIT Press, 2012.

Bayesian Reasoning and Machine Learning.
David Barber.
Cambridge University Press, 2012.
[Available online]
Other references

Information Theory, Inference, and Learning Algorithms.
David J. C. MacKay.
Cambridge University Press, 2003.
[Available online]

Pattern Classification.
Richard O. Duda, Peter E. Hart, David G. Stork.
Wiley, 2001.

Convex Optimization.
Stephen Boyd and Lieven Vandenberghe.
Cambridge University Press, 2004.
[Available online]
Syllabus
There will be five or six homework assignments; you will usually have two weeks to complete each one. The final grade will be computed as
40% homework + 30% midterm + 30% final exam
The midterm will cover the material from the first half of the class. The final exam will cover only the material presented after the midterm; it is not cumulative.
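As a quick illustration of the weighting (the scores below are hypothetical and each component is assumed to be graded out of 100):

\[
\text{grade} = 0.4 \times 85 + 0.3 \times 90 + 0.3 \times 80 = 34 + 27 + 24 = 85
\]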
Preliminary list of topics
Week | Content
1 | Introduction; review of basic concepts: maximum likelihood, Gaussian distributions, etc.
2 | Classification basics: loss functions, naive Bayes, linear classifiers
3 | Support vector machines, convex optimization
4 | Kernels; model selection and cross validation
5 | Ensemble methods: boosting, bagging, random forests
6 | Regression: linear regression, regularization, ridge regression
7 | Linear algebra review; high-dimensional and sparse regression
8 | Dimension reduction, data visualization, principal component analysis
9 | Clustering, mixture models and EM algorithms
10 | Information theory; text analysis
11 | Markov models, PageRank
12 | Hidden Markov models, speech recognition
13 | Bayesian models
14 | Sampling algorithms and MCMC