<HTML><HEAD> <TITLE>Sound Archive</TITLE> <link rel="stylesheet" type="text/css" href="http://www.gatsby.ucl.ac.uk/~turner/style-sheet.css" /> </HEAD> <!-- Google analytics --> <script type="text/javascript"> var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); </script> <script type="text/javascript"> try { var pageTracker = _gat._getTracker("UA-11580771-1"); pageTracker._trackPageview(); } catch(err) {}</script> <BODY style="max-width:850px"> <H1>Sound Archive</H1> <P>This page contains a number of audio demonstrations for my thesis (<A HREF="http://www.gatsby.ucl.ac.uk/~turner/Publications/Thesis.pdf">Statistical Models for Natural Sounds</A>) and for various papers and talks that I have given.</P> <P><B>To give you a feel for the methods developed in this thesis, please listen to this introductory <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/IntroSound.wav">example</A></B>.</P> <P>This example is made from a collection of sounds recorded during a camping trip. Here, a person starts by a camp fire and then walks past a stream to their tent. The wind starts to howl and the person gets to their tent just in time before it rains at the end of the clip.</P> <P>Remarkably, all the sounds in this clip are synthetic (except for the sound of the closing zip). They are produced from a single 'generative model', which has been trained on natural sounds and learns how to produce natural-sounding versions. The model is particularly good at synthesising auditory textures like fire, running water, wind and rain.</P> <P>The model works by learning the important statistics of these sounds. It can then produce new synthetic versions, of arbitrary duration, by ensuring the new sounds match the statistics of the original.
This demonstrates that auditory textures are often defined statistically, a fact first shown by <A HREF="http://www.cns.nyu.edu/~jhm/">Josh McDermott and Eero Simoncelli</A>.</P> <P>Characterising the statistics of natural sounds is important. For example, when your car's automatic speech recognition system tries to figure out what you are saying when there is traffic noise in the background, it will often fail. However, the performance can be enhanced by removing the traffic noise, and this can be done by exploiting the difference between the statistics of the traffic noise and those of speech.</P> <P>Importantly, by generating synthetic sounds, this work also reveals the statistics to which auditory processing is sensitive. This is an important practical tool for understanding how hearing operates.</P> <H2>Chapter 3: Probabilistic Amplitude Demodulation</H2> <P>Probabilistic amplitude demodulation is a new method we invented for estimating the envelope of a signal. Below, we illustrate the method by taking a training sound, extracting its envelope, and then generating a new sound using this envelope and a white noise carrier.
It is clear from this example that the long-time "rhythm" of the sound remains, but the short-time frequency content is missing.</P> <UL> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PAD/Original.wav"> Original Sound used for training</A> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PAD/WhiteNoise.wav"> A new sound containing the inferred envelope multiplying white noise</A> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PAD/Carrier.wav"> The extracted carrier</A> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PAD/Sharpend.wav"> Envelope modification: the inferred envelope was sharpened and recombined with the carrier</A> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PAD/FMSample1.wav"> Sample from the GP-PAD(2) generative model trained on speech: Amplitude modulated Coloured Noise </A> </UL> <P>These examples were also referenced in the <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Publications/TS2007ICA.pdf"> paper </A> that was accepted for oral presentation at the <A HREF="http://www.elec.qmul.ac.uk/ica2007/"> ICA 2007 </A> conference, London. The paper won the "Best Student Paper" award.</P> <H2>Chapter 5: Probabilistic Time-Frequency Representations</H2> <P>Probabilistic time-frequency representations are complementary to <A HREF="http://en.wikipedia.org/wiki/Time frequency_representation">traditional time-frequency representations</A>. They were developed in Chapter 5 of my thesis. This new type of representation is slower to estimate than traditional methods, but once it has been estimated it is very simple to resynthesise modified sounds from it.
For example, below we illustrate how to modify the duration of a sound and how to modify its pitch.</P> <UL> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PHT/74Original.wav"> Original Sound used for training</A> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PHT/74UpSampled.wav"> Upsampling by a factor of 2 via linear interpolation of the original waveform lengthens the stimulus, but shifts the frequencies downwards by a factor of 2.</A> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PHT/74Dilated.wav"> Upsampling the probabilistic STFT coefficients and resynthesising the stimulus lengthens the stimulus, but maintains the frequency components. </A> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PHT/74PitchShiftUp.wav"> Shifting the pitch upwards by 40 percent by resynthesis. </A> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PHT/74PitchShiftDown.wav"> Shifting the pitch downwards by 40 percent by resynthesis. </A> </UL> <H2>Chapter 5: MPAD (and ICASSP) synthetic auditory textures</H2> <P>In this section the goal is to produce synthetic versions of natural sounds by learning their statistics and generating new versions which match those statistics.
In other words, for each of the models below, the model parameters were learned from a training sound (named on the left hand side), and then entirely new sounds were generated using those learned parameters.</P> <table border="0"> <tr> <td>Stream 1</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/05Original.wav">Original</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/05Backwards.wav">Reversed</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/05AR2.wav">AR2 matched spectra</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/05MPADInd.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/05MPADCo.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>Stream 2</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/41Original.wav">Original</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/41Backwards.wav">Reversed</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/41AR2.wav">AR2 matched spectra</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/41MPADInd.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/41MPADCo.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>Stream 3</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/83Original.wav">Original</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/83Backwards.wav">Reversed</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/83AR2.wav">AR2 matched spectra</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/83MPADInd.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/83MPADCo.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>Wind 1</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/75Original.wav">Original</A></td> <td><A 
HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/75Backwards.wav">Reversed</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/75AR2.wav">AR2 matched spectra</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/75MPADInd.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/75MPADCo.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>Wind 2</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/81Original.wav">Original</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/81Backwards.wav">Reversed</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/81AR2.wav">AR2 matched spectra</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/81MPADInd.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/81MPADCo.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>Fire 1</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/24Original.wav">Original</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/24Backwards.wav">Reversed</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/24AR2.wav">AR2 matched spectra</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/24MPADInd.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/24MPADCo.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>Fire 2</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/80Original.wav">Original</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/80Backwards.wav">Reversed</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/80AR2.wav">AR2 matched spectra</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/80MPADInd.wav">Independent Modulation (MPAD)</A></td> <td><A
HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/80MPADCo.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>Rain</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/54Original.wav">Original</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/54Backwards.wav">Reversed</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/54AR2.wav">AR2 matched spectra</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/54MPADInd.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/54MPADCo.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>Applause</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/85Original.wav">Original</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/85Backwards.wav">Reversed</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/85AR2.wav">AR2 matched spectra</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/85MPADInd.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/85MPADCo.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>Snapping Twigs</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/76Original.wav">Original</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/76Backwards.wav">Reversed</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/76AR2.wav">AR2 matched spectra</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/76MPADInd.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/76MPADCo.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>Speech</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/74Original.wav">Original</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/74Backwards.wav">Reversed</A></td> <td><A 
HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/74AR2.wav">AR2 matched spectra</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/74MPADInd.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/74MPADCo.wav">Co-modulation (MPAD)</A></td> </tr> </table> <br> <P>The "original" sounds are the training sounds. In some cases these are very short. The "AR2 matched spectra" sounds comprise a sum of AR(2) processes with parameters chosen to match the training spectra. These sounds are therefore Gaussian noise with spectra parameterised by the AR(2) processes. The "independent modulation" sounds are formed from a sum of independently modulated AR(2) processes. The "co-modulation" sounds are formed from a sum of co-modulated AR(2) processes. In this way, the models get more complicated from left to right across the table. Similarly, the complexity of the sounds increases down the table. So, whilst water is captured relatively well by independently modulated AR(2) processes, fire requires co-modulated processes to capture the crackles. Rain also requires co-modulation to capture the sound of the droplets hitting leaves, but because this sound is asymmetric through time, the models, whose statistics are invariant under a reversal of time, cannot perfectly capture it.
Similarly, speech cannot be captured, both because of this temporal asymmetry and because of other higher-order statistics which the models do not capture.</P> <P> Here is some more detail about some of the sounds. The first wind sound is dominated by only three patterns of co-modulation, as the following sounds demonstrate:</P> <br><br> <table border="0"> <tr> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/75MPADCo.wav">Full Generated Sound</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/75Wind.wav">First three components</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/75Extra.wav">Remaining 12 components</A></td> </tr> </table> <br> <P>Similarly, the first fire sound is dominated by one component which handles the crackling sound. The remaining components capture the slower aspects of the fire sound:</P> <br><br> <table border="0"> <tr> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/24MPADCo.wav">Full Generated Sound</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/24Crack.wav">First component</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/24NoCrack.wav">Remaining 27 components</A></td> </tr> </table> <br> <P>Once again, the rain sound is dominated by two components which handle the transient sound of the water droplets hitting the leaves.
The remaining components capture the slower aspects of the rain sound:</P> <br><br> <table border="0"> <tr> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/54MPADCo.wav">Full Generated Sound</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/54Drop.wav">First two components</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/54NoDrop.wav">Remaining 27 components</A></td> </tr> </table> <br> <P>The generated speech sound contains a number of different simple phoneme-like components.</P> <br><br> <table border="0"> <tr> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/74MPADCo.wav">Full Generated Sound</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/74Component1.wav">First component</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/74Component2.wav">Second component</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/74Component3.wav">Third component</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/74Component4.wav">Fourth component</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/74Component5.wav">Fifth component</A></td> </tr> </table> <br> <H2>Chapter 5: Chimera</H2> <P> These auditory chimeras contain the carriers inferred from one sound and the modulators inferred from another, possibly synthetic, sound. They indicate the aspects of the sounds which are captured by the two types of process.
</P> <table border="0"> <tr> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/SpeechAmpSinCar.wav">Modulators from a speech sound, sinusoidal carriers at the filter centre frequencies</A></td> </tr> <tr> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/SpeechAmpAR2Car.wav">Modulators inferred from a speech sound, random carriers drawn from AR(2) processes</A></td> </tr> <tr> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/ConstAmpSpeechCar.wav">Constant amplitude, carriers inferred from a speech sound (posterior mean)</A></td> </tr> <tr> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/ConstAmpSpeechCarSampled.wav">Constant amplitude, carriers inferred from a speech sound (sample from the posterior)</A></td> </tr> </table> <br> <P>When using MPAD (see <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Publications/Thesis.pdf">Chapter 5 of my thesis</A>) to carry out sub-band demodulation, the posterior mean over the carriers often appears to contain modulation. That is, it appears as though MPAD is not demodulating the sub-bands fully. However, it turns out that this feature is due to the posterior mean not being typical of the posterior distribution. Consider inferring a carrier when the associated amplitude is very small (compared to the observation noise). The posterior mean of the carrier reverts to the prior mean, which is zero. This means that the posterior mean of the carriers tends to contain more energy when the amplitude is large than when it is small. However, the posterior variance of the carriers is higher in regions of low amplitude. Therefore, a sample from the posterior distribution over the carriers contains rather less modulation than the posterior mean.
For this reason, chimeras should be produced using samples from the posterior distribution over the carriers, rather than from the posterior mean itself.</P> <H2>Chapter 5: Filling in missing data experiments</H2> <P>Various generative models were used to fill in missing sections of speech. </P> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/Original.wav">The original speech sound.</A> <br> <table border="0"> <tr> <td>1.25ms</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/10Miss.wav">Missing</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/10PHT.wav">Bayesian Spectrum Estimation</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/10AR2.wav">Trained AR(2) filter bank</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/10MPADComp.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/10MPAD.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>6.25ms</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/50Miss.wav">Missing</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/50PHT.wav">Bayesian Spectrum Estimation</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/50AR2.wav">Trained AR(2) filter bank</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/50MPADComp.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/50MPAD.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>9.375ms</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/75Miss.wav">Missing</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/75PHT.wav">Bayesian Spectrum Estimation</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/75AR2.wav">Trained AR(2) filter bank</A></td> <td><A
HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/75MPADComp.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/75MPAD.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>12.5ms</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/100Miss.wav">Missing</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/100PHT.wav">Bayesian Spectrum Estimation</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/100AR2.wav">Trained AR(2) filter bank</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/100MPADComp.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/100MPAD.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>15.625ms</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/125Miss.wav">Missing</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/125PHT.wav">Bayesian Spectrum Estimation</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/125AR2.wav">Trained AR(2) filter bank</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/125MPADComp.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/125MPAD.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>18.75ms</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/150Miss.wav">Missing</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/150PHT.wav">Bayesian Spectrum Estimation</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/150AR2.wav">Trained AR(2) filter bank</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/150MPADComp.wav">Independent Modulation (MPAD)</A></td> <td><A 
HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/150MPAD.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>25ms</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/200Miss.wav">Missing</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/200PHT.wav">Bayesian Spectrum Estimation</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/200AR2.wav">Trained AR(2) filter bank</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/200MPADComp.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/200MPAD.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>31.25ms</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/250Miss.wav">Missing</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/250PHT.wav">Bayesian Spectrum Estimation</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/250AR2.wav">Trained AR(2) filter bank</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/250MPADComp.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/250MPAD.wav">Co-modulation (MPAD)</A></td> </tr> <tr> <td>37.5ms</td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/300Miss.wav">Missing</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/300PHT.wav">Bayesian Spectrum Estimation</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/300AR2.wav">Trained AR(2) filter bank</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/300MPADComp.wav">Independent Modulation (MPAD)</A></td> <td><A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/mpad/Missing/300MPAD.wav">Co-modulation (MPAD)</A></td> </tr> </table> </br> <!-- <H2>Probabilistic Amplitude and Frequency Demodulation</H2> From the <A 
HREF="http://www.gatsby.ucl.ac.uk/~turner/PublicTalks/ICA2007/ICA2007.pdf"> talk </A> presented at <A HREF="http://www.elec.qmul.ac.uk/ica2007/"> ICA 2007 </A> conference, London. <UL> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PAD/Carrier.wav"> Sample carrier </A> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/PAD/Sample.wav"> Sample sound </A> </UL> --> <!-- <H2>Modulation Cascade Process</H2> From the <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Publications/MCP.pdf"> paper</A> submitted to NIPS this year <UL> <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/MCP/Y_original.wav"> Segment of original sounds used for training </A>. <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/MCP/MCPFM.wav"> One sample from the foward model that has been trained on the spoken sentence above</A>. <LI> <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/MCP/MCPFM2.wav"> A second sample from the foward model that has been trained on the spoken sentence above</A>. </UL> --> <H2>Auditory Scene Analysis</H2> <P>From the research <A HREF="http://www.gatsby.ucl.ac.uk/~turner/ResearchTalks/ASA/ASART.pdf"> talk</A> (and Cosyne 2008 poster)</P> <UL> <LI> Grouping principle 1: Good continuation. <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/ASA/TonePips.wav"> Tone Pips </A>, <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/ASA/Glisandi.wav"> Glissandi </A>. Faster version: <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/ASA/TonePipsFast.wav"> Tone Pips </A>, <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/ASA/GlisandiFast.wav"> Glissandi </A>. <LI> Grouping principle 2: Closure.<A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/ASA/GlisandiGap.wav"> Glissandi with small gaps</A>,<A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/ASA/GlisandiNoise.wav"> Glissandi with small gaps filled by noise</A>. 
<LI> Grouping principle 3: Common fate. <A HREF="http://www.gatsby.ucl.ac.uk/~turner/Sounds/ASA/HarmonicStacks.wav">Harmonic stacks</A>: The first half contains four sinusoids with no modulation. The second half contains the same four components, but pairs are modulated independently in frequency. </UL> <hr> <P><A HREF="http://www.gatsby.ucl.ac.uk/~turner/index.html">Return to main page</A></P> </BODY></HTML>