Statistical Models for Natural Sounds
Richard E. Turner
University College London PhD Thesis
Understanding the rich structure of natural sounds is necessary both
for solving practical tasks, like automatic speech recognition, and
for understanding auditory processing in the brain. This thesis takes a
step in this direction by characterising the statistics of simple
natural sounds. We focus on the statistics because perception often
appears to depend on them, rather than on the raw waveform. For
example, the perception of auditory textures, like running water, wind,
fire and rain, depends on summary statistics, like the rate of falling
rain droplets, rather than on the exact details of the physical
source.
In order to analyse the statistics of sounds accurately it is
necessary to improve a number of traditional signal processing
methods, including those for amplitude demodulation, time-frequency
analysis, and sub-band demodulation. These estimation tasks are
ill-posed and therefore it is natural to treat them as Bayesian
inference problems. The new probabilistic versions of these methods
have several advantages. For example, they perform more accurately on
natural signals, are more robust to noise, can fill in missing
sections of data, and provide error bars. Furthermore,
free parameters can be learned from the signal. Using these new
algorithms we demonstrate that the energy, sparsity, modulation depth
and modulation time-scale in each sub-band of a signal are critical
statistics, together with the dependencies between the sub-band
modulators. In order to validate this claim, a model containing
co-modulated coloured noise carriers is shown to be capable of
generating a range of realistic-sounding auditory textures.
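As a rough illustration of what "co-modulated coloured noise carriers" can mean, the Python sketch below sums several band-limited noise carriers, each multiplied by a slowly varying positive envelope that shares a common component across bands. It is not the model developed in the thesis: the function names, band edges, 8 Hz modulation cutoff, softplus nonlinearity and 0.8/0.2 mixing weights are all assumptions chosen for illustration.

```python
import numpy as np

def bandpass_noise(n, fs, f_lo, f_hi, rng):
    """Coloured noise carrier: white Gaussian noise restricted to [f_lo, f_hi] Hz."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    x = np.fft.irfft(spectrum, n=n)
    return x / np.std(x)

def slow_envelope(n, fs, cutoff_hz, rng):
    """Positive, slowly varying modulator: low-pass noise passed through a softplus."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum[freqs > cutoff_hz] = 0.0
    z = np.fft.irfft(spectrum, n=n)
    z = z / np.std(z)
    return np.log1p(np.exp(2.0 * z))  # softplus keeps the envelope positive

def synthesise_texture(duration_s=1.0, fs=16000, seed=0):
    """Sum of sub-band noise carriers whose envelopes share a common modulator."""
    rng = np.random.default_rng(seed)
    n = int(duration_s * fs)
    # Illustrative sub-band edges in Hz (assumed, not taken from the thesis).
    bands = [(200, 400), (400, 800), (800, 1600), (1600, 3200)]
    shared = slow_envelope(n, fs, cutoff_hz=8.0, rng=rng)
    signal = np.zeros(n)
    envelopes = []
    for f_lo, f_hi in bands:
        private = slow_envelope(n, fs, cutoff_hz=8.0, rng=rng)
        # Co-modulation: each band's envelope is mostly the shared modulator.
        envelope = 0.8 * shared + 0.2 * private
        envelopes.append(envelope)
        signal += envelope * bandpass_noise(n, fs, f_lo, f_hi, rng)
    return signal / np.max(np.abs(signal)), np.array(envelopes)

signal, envelopes = synthesise_texture()
```

In this sketch the statistics named above map onto a few knobs: the band edges set the carrier colour, the envelope cutoff sets the modulation time-scale, the softplus gain sets the modulation depth, and the mixing weights set the dependency between sub-band modulators.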
Finally, we explore the connection between the statistics of natural
sounds and perception. We demonstrate that inference in the model for
auditory textures qualitatively replicates the primitive grouping
rules that listeners use to understand simple acoustic scenes. This
suggests that the auditory system is optimised for the statistics of
natural sounds.