A Kernel Two-Sample Test by Arthur Gretton, Karsten Borgwardt, Malte Rasch,


This package contains Matlab implementations of various kernel-based statistical hypothesis tests for the two-sample problem, as described in GreEtAl07, GreEtAl09, and GreEtAl12.
We propose to test whether distributions P and Q are different on the basis of samples drawn from each of them, by finding a smooth function (the witness function) which is large on the points drawn from P, and small (as negative as possible) on the points from Q. We use as our test statistic the difference between the mean function values on the two samples, or maximum mean discrepancy (MMD): when this is large, the samples are likely from different distributions. Smoothness is enforced by restricting the witness function to a unit ball in a reproducing kernel Hilbert space. The MMD is an instance of an integral probability metric.
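The idea can be illustrated with a minimal Python sketch. This is not the package's `mmd.m`: the Gaussian kernel, the bandwidth `sigma`, the biased form of the estimate, and every function name below are assumptions chosen for illustration. With a kernel k, the squared MMD can be estimated as the mean of k over pairs within the first sample, minus twice the mean over pairs across the two samples, plus the mean over pairs within the second sample. The empirical witness function at a point t is the difference between the mean kernel values from t to each sample.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2_biased(xs, ys, sigma=1.0):
    """Biased estimate of the squared MMD between samples xs and ys."""
    return (mean_kernel(xs, xs, sigma)
            - 2.0 * mean_kernel(xs, ys, sigma)
            + mean_kernel(ys, ys, sigma))


def witness(t, xs, ys, sigma=1.0):
    """Unnormalised empirical witness function evaluated at the point t."""
    fx = sum(gaussian_kernel(t, x, sigma) for x in xs) / len(xs)
    fy = sum(gaussian_kernel(t, y, sigma) for y in ys) / len(ys)
    return fx - fy


# Synthetic 1-D data: two samples from N(0, 1) and one from N(2, 1).
rng = random.Random(0)
p_sample = [(rng.gauss(0.0, 1.0),) for _ in range(100)]
q_same = [(rng.gauss(0.0, 1.0),) for _ in range(100)]
q_shifted = [(rng.gauss(2.0, 1.0),) for _ in range(100)]

print("same distribution:   ", mmd2_biased(p_sample, q_same))
print("shifted distribution:", mmd2_biased(p_sample, q_shifted))
```

In this sketch the statistic for the shifted sample should come out much larger than for the sample drawn from the same distribution, and the witness function is positive near the bulk of the first sample and negative near the bulk of the second.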
Four strategies may be used to calculate the test threshold:
An earlier version of this test was proposed in BorEtAl06; however, the current test estimates the null distribution more accurately and should be used in preference to the earlier algorithm.
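To make the role of the null distribution concrete, here is a hedged Python sketch of one generic way to obtain a test threshold: reshuffle the pooled sample many times and take an upper quantile of the resulting statistics. It is an illustration only and is not claimed to match any of the options implemented in `mmd.m`. The function names, the Gaussian kernel with `sigma=1.0`, the level `alpha`, and the number of permutations are all assumptions.

```python
import math
import random


def mmd2_biased(xs, ys, sigma=1.0):
    """Biased squared-MMD estimate for 1-D samples with a Gaussian kernel."""
    def mean_k(a, b):
        return sum(math.exp(-(s - t) ** 2 / (2.0 * sigma ** 2))
                   for s in a for t in b) / (len(a) * len(b))
    return mean_k(xs, xs) - 2.0 * mean_k(xs, ys) + mean_k(ys, ys)


def permutation_threshold(xs, ys, alpha=0.05, n_perm=100, seed=0):
    """Estimate the (1 - alpha) quantile of the statistic under the null
    hypothesis P = Q by repeatedly reshuffling the pooled sample."""
    rng = random.Random(seed)
    pooled = list(xs) + list(ys)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        null_stats.append(mmd2_biased(pooled[:len(xs)], pooled[len(xs):]))
    null_stats.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return null_stats[index]


# Synthetic example: N(0, 1) against N(1.5, 1), 40 points each.
data_rng = random.Random(1)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
ys = [data_rng.gauss(1.5, 1.0) for _ in range(40)]

statistic = mmd2_biased(xs, ys)
threshold = permutation_threshold(xs, ys)
print("statistic:", statistic)
print("threshold:", threshold)
print("reject P = Q:", statistic > threshold)
```

Resampling of this kind is simple but costly, since the statistic is recomputed for every permutation.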
The archive contains two files: mmd.m is the main code, and U4thmoment.c contains additional optimised C code for one of the test options. While the algorithm runs in standalone form, it can also be used with the Spider machine learning toolbox. The code was written by Malte Rasch.
[GreEtAl12] Gretton, A., K. Borgwardt, M. Rasch, B. Schoelkopf and A. Smola: A Kernel Two-Sample Test. JMLR 2012.
[GreEtAl09] Gretton, A., K. Borgwardt, M. Rasch, B. Schoelkopf and A. Smola: A Fast, Consistent Kernel Two-Sample Test. NIPS 2009.
[GreEtAl07] Gretton, A., K. Borgwardt, M. Rasch, B. Schoelkopf and A. Smola: A Kernel Method for the Two-Sample-Problem. NIPS 2006.
[BorEtAl06] Borgwardt, K., A. Gretton, M. Rasch, H.P. Kriegel, B. Schoelkopf and A. Smola: Integrating structured biological data by Kernel Maximum Mean Discrepancy. Bioinformatics 22(14), 19 (2006).