Reinforcement learning of songbird premotor representations in a spiking neural network model

Ila Fiete,1 Richard Hahnloser,3 Alexay Kozhevnikov,3 Michale Fee,3 and Sebastian Seung2

3Bell Labs, Lucent Technologies

The birdsong motor circuit is hierarchical: nucleus HVC projects to the premotor nucleus RA, which in turn drives motor neurons. Recent experiments show that RA-projecting HVC neurons fire in temporally sparse sequences that drive activity in RA. In this context, the role of RA appears to be the conversion of abstract neural sequences in HVC into motor activity.

In this study, we illustrate how an appropriate mapping from HVC activity to motor activity could be learned via plastic connections between HVC and RA. Such learning is commonly thought to be driven by reinforcement (Doya & Sejnowski 1995, Troyer & Doupe 2000), with a reward signal generated by comparing the bird's vocal output with an internally stored copy (template) of its tutor song.

We construct a reinforcement learning model with spiking neurons that learns the HVC-to-RA connections in a feedforward network consisting of HVC, RA, and a motor layer. We assume that HVC provides a sparse sequence. Learning is governed by a synaptic plasticity rule that exploits correlations between two quantities: fluctuations in the motor output caused by noisy neural inputs, and a positive, scalar, global reward that depends on the match between the network output and the stored template. We explore motor fluctuations arising either from the inherent stochasticity of HVC-to-RA synapses or from noise (possibly generated by LMAN) injected into RA. The learning rule performs stochastic gradient ascent on the reward and is robust over a wide range of parameters. Patterns of RA activity in the trained model are compared with data from zebra finches.
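The idea of correlating injected noise with a global reward can be illustrated with a minimal sketch. It is not the spiking model described above but a rate-based simplification under stated assumptions: RA units are linear, the RA-to-motor weights `A` are fixed and random, each HVC unit is active at exactly one time step, the reward is `exp(-mean squared error)` against the template, and a running-average reward baseline is subtracted to reduce variance. All function names, parameter names, and values (`song_reward`, `train`, `eta`, `sigma`, the layer sizes) are hypothetical choices made for illustration.

```python
import math
import random


def song_reward(W, A, template, noise=None):
    """Positive scalar reward for one rendition: exp(-mean squared error)."""
    n_steps, n_motor = len(template), len(A)
    err = 0.0
    for t in range(n_steps):
        # Sparse HVC (assumption): only HVC unit t is active at step t,
        # so the drive to RA at that step is simply row W[t], plus noise.
        ra = [w + (noise[t][i] if noise else 0.0) for i, w in enumerate(W[t])]
        for k in range(n_motor):
            motor = sum(a * r for a, r in zip(A[k], ra))
            err += (motor - template[t][k]) ** 2
    return math.exp(-err / n_steps)


def train(n_steps=8, n_ra=6, n_motor=2, n_trials=3000,
          eta=0.05, sigma=0.1, seed=0):
    """Return (reward before learning, reward after learning), both noise-free."""
    rng = random.Random(seed)
    # Fixed RA-to-motor weights (assumed non-plastic in this sketch).
    A = [[rng.gauss(0.0, 1.0) / math.sqrt(n_ra) for _ in range(n_ra)]
         for _ in range(n_motor)]
    # Stand-in for the stored tutor-song template: target motor output per step.
    template = [[rng.uniform(-1.0, 1.0) for _ in range(n_motor)]
                for _ in range(n_steps)]
    # Plastic HVC-to-RA weights, initially zero.
    W = [[0.0] * n_ra for _ in range(n_steps)]

    initial = song_reward(W, A, template)
    baseline = initial  # running average of recent rewards (assumption)
    for _ in range(n_trials):
        # Independent Gaussian noise injected into every RA unit at every step.
        noise = [[rng.gauss(0.0, sigma) for _ in range(n_ra)]
                 for _ in range(n_steps)]
        reward = song_reward(W, A, template, noise)
        # Correlate the reward fluctuation with the noise each synapse saw;
        # presynaptic HVC activity is 1 at the active step and 0 elsewhere.
        gain = eta * (reward - baseline) / sigma ** 2
        for t in range(n_steps):
            for i in range(n_ra):
                W[t][i] += gain * noise[t][i]
        baseline += 0.1 * (reward - baseline)
    return initial, song_reward(W, A, template)


if __name__ == "__main__":
    before, after = train()
    print(f"reward before: {before:.3f}, after: {after:.3f}")
```

Averaged over the noise, the weight change in this sketch is approximately proportional to the gradient of the reward with respect to the weights. That is the sense in which such a rule performs stochastic gradient ascent, even though each synapse sees only its own noise and a single global scalar.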