21st November 2003 - CFG
Mark and Nathaniel will present on learning in stochastic context-free grammars, in particular maximum-likelihood (ML) learning of grammar probabilities via EM and Bayesian model selection of grammars.
The more complete list of readings below should help in understanding the above.
EM (Inside-Outside) for PCFGs:
- Durbin et al. (1998) Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Chapter 9. Not available online, but a summarised version is available at http://www.biostat.wisc.edu/bmi776/SCFGs-for-RNA.pdf
- Manning and Schütze (1999) Foundations of Statistical Natural Language Processing. Chapter 11. Also not available online, but a summarised version is available at http://www.stanford.edu/class/cs224n/new_handouts/fsnlp-pcfg-slides-6.pdf
- You may also find the following notes useful: http://citeseer.nj.nec.com/496529.html
Bayesian Learning of Grammars:
- Stanley F. Chen (1995) Bayesian Grammar Induction for Language Modeling. http://citeseer.nj.nec.com/300206.html
- Andreas Stolcke and Stephen Omohundro (1994) Inducing Probabilistic Grammars by Bayesian Model Merging. http://citeseer.nj.nec.com/stolcke94inducing.html