18th July 2005 — Anna Goldenberg: Sparse data Bayes Net Structure search (SBNS)

Our visitor Anna Goldenberg will give a talk:

Bayesian Networks have been successfully applied in many areas, such as pharmaceuticals, decision making by doctors, air control, and marketing. Structural learning of Bayesian Networks is usually a desirable but costly operation. In some domains it is possible to collect expert knowledge and manually create a structure for a Bayes Net. However, social networks, warehousing data, or supermarket purchasing records may contain hundreds of thousands of attributes. Providing expert Bayes Net structures in such cases is cumbersome if not impossible, even if, as is the case in many of those domains, each event is a choice of a very small subset of the large pool of available entities. The complexity of existing structure search algorithms prevents Bayes Net learning on datasets of that size.

I will describe an algorithm for tractable structural learning in Bayes Nets that explores structures at the local level. The algorithm exploits the computational efficiency of Frequent Sets to gather the statistics that are most likely to be useful for structure search, under the assumption that the data are sparse. I will also present an empirical evaluation of the algorithm applied to several massive datasets, along with a few extensions that I’m currently pursuing.
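
As a rough illustration of the kind of statistics Frequent Sets provide, the sketch below counts how often small groups of items occur together in sparse records. It is not the SBNS algorithm itself: the function name `frequent_sets`, the parameters `min_support` and `max_size`, and the toy basket data are all assumptions made for this example, and the counting is a plain brute-force enumeration that is only practical because each record contains few items.

```python
from collections import Counter
from itertools import combinations


def frequent_sets(transactions, min_support, max_size=2):
    """Return itemsets of up to max_size items seen in at least min_support records.

    Hypothetical helper for illustration only; each record is assumed to be
    a short list of items, which is what keeps the enumeration cheap.
    """
    counts = Counter()
    for record in transactions:
        items = sorted(set(record))  # deduplicate and fix an item order
        for size in range(1, max_size + 1):
            for subset in combinations(items, size):
                counts[subset] += 1
    # Keep only the itemsets that meet the support threshold.
    return {s: c for s, c in counts.items() if c >= min_support}


# Toy data (made up): each record is a small subset of a large item pool.
transactions = [
    ["milk", "bread"],
    ["milk", "bread", "eggs"],
    ["eggs"],
    ["milk", "eggs"],
]

result = frequent_sets(transactions, min_support=2)
for itemset, count in sorted(result.items()):
    print(itemset, count)
```

Co-occurrence counts like these are the raw material a structure search could use to score candidate local structures without rescanning the full dataset for every candidate.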