GATSBY COMPUTATIONAL NEUROSCIENCE UNIT

A. Taylan Cemgil

Signal Processing and Communications Laboratory, Dept. of Engineering, University of Cambridge, UK


Wednesday 17 January 2007, 16:00

Seminar Room B10 (Basement)

Alexandra House, 17 Queen Square, London, WC1N 3AR


Generative models for acoustic processing


The analysis of audio signals is central both to the scientific understanding of human hearing and to a broad spectrum of engineering applications, ranging from sound localisation to hearing aids and music information retrieval. Historically, the main mathematical tools have come from signal processing: digital filtering theory, system identification and various transform methods such as Fourier techniques. In recent years, there has been increasing interest in statistical approaches and tools from machine learning.


The application of statistical techniques is quite natural: acoustic time series can be conveniently modelled using hierarchical signal models that incorporate prior knowledge from various sources, such as physics or studies of human cognition and perception. Once a realistic hierarchical model is constructed, many tasks such as coding, analysis, restoration, transcription, separation, identification or resynthesis can be formulated consistently as Bayesian posterior inference problems.
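As a deliberately simplified illustration of this viewpoint (the parameters and the AR(1) prior here are chosen purely for the example, not taken from the talk), the sketch below treats restoration of a noisy recording as Bayesian posterior inference: a Gaussian autoregressive prior over the clean signal, plus a Gaussian observation model, makes the posterior mean the solution of a sparse linear system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative hierarchical signal model (hypothetical parameters):
#   latent clean signal x follows a smooth Gaussian AR(1) prior,
#   observation y = x + white Gaussian noise.
T, a, q, r = 200, 0.95, 0.1, 0.5
x = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.normal(0.0, np.sqrt(q))
y = x + rng.normal(0.0, np.sqrt(r), T)

# The AR(1) prior is Gaussian with a tridiagonal precision matrix, so the
# posterior over x given y is Gaussian and its mean solves a linear system:
#   (P_prior + I/r) m = y/r
P = np.zeros((T, T))
i = np.arange(T)
P[i, i] = (1.0 + a**2) / q        # interior diagonal terms
P[T - 1, T - 1] = 1.0 / q         # last state has no successor
P[i[:-1], i[:-1] + 1] = -a / q    # super-diagonal
P[i[:-1] + 1, i[:-1]] = -a / q    # sub-diagonal
post_mean = np.linalg.solve(P + np.eye(T) / r, y / r)
```

The posterior mean is a smoothed estimate of the clean signal; the models discussed in the talk are of course far richer, but the principle is the same: once prior and likelihood are written down, restoration reduces to posterior inference.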


In this talk, I will sketch my past and current work on audio and music signal analysis. In particular, I will focus on factorial switching state space models and illustrate how realistic generative signal models can be developed in this formalism for transcription, restoration and source separation. Within this model class, certain changepoint models admit exact inference; otherwise, efficient algorithms based on variational and stochastic approximation methods can be developed.
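A minimal generative sketch of a switching state space model is given below, with hypothetical parameters not taken from the talk: a persistent binary regime indicator switches the continuous state between a "silence" regime and a damped-oscillator "tone" regime, and a noisy scalar observation is taken from the state. Posterior inference over the regime sequence given the observations would correspond to a toy transcription problem; it is this discrete posterior that is exponentially hard in general, motivating the exact changepoint special cases and the variational and stochastic approximations mentioned above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Regime persistence: P(s_t | s_{t-1}); regimes rarely switch.
trans = np.array([[0.98, 0.02],
                  [0.02, 0.98]])

# Regime 0 ("silence"): state decays quickly toward zero.
# Regime 1 ("tone"): damped rotation, i.e. a decaying sinusoid.
omega = 2 * np.pi * 0.05  # oscillator frequency in radians per sample
rot = 0.99 * np.array([[np.cos(omega), -np.sin(omega)],
                       [np.sin(omega),  np.cos(omega)]])
A = [0.1 * np.eye(2), rot]                      # per-regime dynamics
Q = [1e-4 * np.eye(2), 1e-3 * np.eye(2)]        # per-regime process noise

T = 300
s = np.zeros(T, dtype=int)      # discrete regime indicators
x = np.zeros((T, 2))            # continuous latent state
y = np.zeros(T)                 # scalar noisy observation
for t in range(1, T):
    s[t] = rng.choice(2, p=trans[s[t - 1]])
    x[t] = A[s[t]] @ x[t - 1] + rng.multivariate_normal(np.zeros(2), Q[s[t]])
    y[t] = x[t, 0] + rng.normal(0.0, 0.05)
```

A factorial variant would run several such indicator chains in parallel (e.g. one per note), which is precisely what makes exact marginalisation over the discrete variables intractable in general.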