########################################################################
Sparse Gaussian Process Regression v1.0
  --- post-backfitting matching pursuit
  --- pre-backfitting matching pursuit
  --- information gain approach
  --- baseline random selection

NOTE: model parameters should be specified by the user.

SOURCE CODE: http://www.gatsby.ucl.ac.uk/~chuwei/code/sgpr.tar

Wei CHU (C)Copyright 2005-2006 at Gatsby Unit.
########################################################################

1. Installation

a) download sgpr.tar to your machine (Linux is assumed)
b) tar -xvf sgpr.tar
c) cd sgpr
d) the program of post-backfitting matching pursuit is generated by
   gcc -o sgpr *.c -lg2c -lm -O2 -Wall -D _GPR_KEERTHI
e) the program of pre-backfitting matching pursuit is generated by
   gcc -o sgpr *.c -lg2c -lm -O2 -Wall -D _GPR_SMOLA
f) the program of information gain approach is generated by
   gcc -o sgpr *.c -lg2c -lm -O2 -Wall -D _GPR_SEEGER
g) the program of baseline random selection is generated by
   gcc -o sgpr *.c -lg2c -lm -O2 -Wall
h) the program using ARD kernel is generated by
   gcc -o sgpr *.c -lg2c -lm -O2 -Wall -D _GPRARD
i) the program using normalized inputs is generated by
   gcc -o sgpr *.c -lg2c -lm -O2 -Wall -D _GPRNORMALIZEINPUT
j) the program using normalized targets is generated by
   gcc -o sgpr *.c -lg2c -lm -O2 -Wall -D _GPRNORMALIZETARGET

2. Data Format

a) a space-delimited plain text file with a new-line character at the
   end of each line
b) each line contains one sample
c) on each line, the last element is the target, a real value
e) the training data are saved in a text file, say "mytask_train.1"
f) the test data, with or without targets, are saved in "mytask_test.1"
g) the test targets are saved in "mytask_targets.1" if available
h) "mytask" can be any string that does not itself contain "train" or
   "test"; it is followed by "_train" or "_test"
i) see the example files under the folder:
   sinctoy_train.dat and sinctoy_test.dat

3. Input and Output

a) prepare your data files
   "mytask_train.1"
   "mytask_test.1"
   "mytask_targets.1" (optional)
b) run the program (specify your model parameters as well !!)
   ./sgpr mytask_train.1
c) if _GPRARD has been defined, the file "mytask_train.1.ard" can be
   supplied to specify the values of the ARD parameters
d) output files
   1) "mytask_test.1.std" saves the std of the predictive distribution
      of the test data
   2) "mytask_test.1.mean" saves the mean of the predictive distribution
      of the test data
   3) a log file saves the results for batch tasks. Each line contains:
      ase nmse nlpd aae rmse qgap time

4. Options

a) for help information
   ./sgpr
b) specify initial noise variance, say 0.4
   ./sgpr -S 0.4 mytask_train.1
c) specify initial kernel parameter, say 0.01
   ./sgpr -K 0.01 mytask_train.1
d) specify the power level, say 3.0
   ./sgpr -O 3 mytask_train.1
e) specify the number of basis functions, say 1000
   ./sgpr -D 1000 mytask_train.1

5. Reference

"A matching pursuit approach to sparse Gaussian process regression",
S. S. Keerthi and W. Chu
NIPS-18 2005
http://www.gatsby.ucl.ac.uk/~chuwei/paper/sgpr.ps

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
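
Appendix: preparing a data file (illustrative sketch)

The data format described in section 2 can be produced with a few lines
of Python. The sketch below is not part of the sgpr distribution. The
helper names (format_sample, write_samples, read_samples) are
hypothetical, and it assumes that the elements before the target on each
line are the input features, which the format description implies but
does not state outright.

```python
import math
import os
import tempfile


def format_sample(features, target):
    """Return one sample as a line: input features first, target last."""
    values = list(features) + [target]
    return " ".join(repr(float(v)) for v in values)


def write_samples(path, inputs, targets):
    """Write one sample per line; every line ends with a new-line character."""
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")
    with open(path, "w") as handle:
        for features, target in zip(inputs, targets):
            handle.write(format_sample(features, target) + "\n")


def read_samples(path):
    """Parse a file in the same format back into (inputs, targets)."""
    inputs, targets = [], []
    with open(path) as handle:
        for line in handle:
            fields = [float(token) for token in line.split()]
            if not fields:
                continue  # skip blank lines
            inputs.append(fields[:-1])  # assumption: features precede target
            targets.append(fields[-1])  # the last element is the target
    return inputs, targets


def sinc(x):
    """Toy target function, used here only to generate example data."""
    return 1.0 if x == 0.0 else math.sin(x) / x


if __name__ == "__main__":
    xs = [[-3.0 + 0.5 * i] for i in range(13)]
    ys = [sinc(x[0]) for x in xs]
    out_dir = tempfile.mkdtemp()
    train_path = os.path.join(out_dir, "mytask_train.1")
    write_samples(train_path, xs, ys)
    print("wrote", train_path)
```

The resulting file can then be passed to the program as in section 3,
for example ./sgpr mytask_train.1 run from the directory that holds the
data files.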