14th July 2003 - Data visualisation

Tom Minka has offered to provide the following review:

Data visualization: New methods and unsolved problems

Visualization is a powerful method for checking model assumptions and, more generally, for learning about data. However, the literature is very scattered, making it difficult to get a coherent picture of what has been done and which problems remain open. I will give a tour of methods for multivariate visualization, including some new methods, emphasizing the connections between them. In many cases, visualization is intertwined with inference and requires optimizing a suitably chosen cost function, which makes it quite similar to problems in machine learning. I will also walk through examples of using visualization to learn about real datasets.

Some useful reading:

Lecture notes for Data Mining (specifically, the section on Multivariate Geometry) http://www.stat.cmu.edu/~minka/courses/36-350.2001/schedule.html

"Building statistical models by visualization" http://www.stat.cmu.edu/~minka/papers/viz.html

"Prosection Views: Dimensional Inference through Sections and Projections", G. W. Furnas and A. Buja, Journal of Computational and Graphical Statistics 3, 323-385 (1994) http://citeseer.nj.nec.com/furnas94prosection.html