In this study, we illustrate how an appropriate mapping from HVC activity to motor output could be learned via plastic connections between HVC and RA. Such learning is commonly thought to be driven by reinforcement (Doya & Sejnowski 1995, Troyer & Doupe 2000), with a reward signal generated by comparing the bird's vocal output with an internally stored copy (template) of its tutor song.
We construct a reinforcement model with spiking neurons that learns HVC-to-RA connections in a feedforward network of HVC, RA, and a motor layer. We assume that HVC provides a sparse temporal sequence of activity, and that learning is governed by a synaptic plasticity rule that exploits correlations between fluctuations in the motor output, caused by noisy neural inputs, and a positive scalar global reward that depends on the match between the network output and the stored template. We explore motor fluctuations arising either from the inherent stochasticity of HVC-to-RA synapses or from (possibly LMAN-generated) noise injected into RA. The learning rule performs stochastic gradient ascent on the reward and is robust over a wide range of parameters. Patterns of RA activity in the trained model are compared with data from zebra finches.
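The class of learning rule described above can be sketched in a few lines. The toy model below uses weight perturbation: each trial, the HVC-to-RA weights are jittered by noise, a scalar reward measures the match to a template, and the weight update correlates the reward fluctuation (relative to a running baseline) with the noise, which on average follows the gradient of expected reward. The layer sizes, noise amplitude, quadratic reward, and linear readout are illustrative assumptions for this sketch, not the paper's actual spiking model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and constants -- illustrative, not from the paper.
n_hvc, n_ra = 10, 3   # HVC and RA layer sizes
T = 10                # time steps in one song rendition
sigma = 0.1           # amplitude of synaptic fluctuations
eta = 0.1             # learning rate
n_trials = 3000

# Sparse HVC sequence: a single HVC unit bursts at each time step.
hvc = np.eye(T, n_hvc)

# A target weight matrix stands in for the stored tutor-song template.
W_target = rng.normal(size=(n_hvc, n_ra))
template = hvc @ W_target

def reward(output):
    """Scalar global reward: larger when output is closer to the template."""
    return -np.sum((output - template) ** 2)

W = np.zeros((n_hvc, n_ra))
r_init = reward(hvc @ W)
baseline = r_init  # running average of the reward

for _ in range(n_trials):
    xi = sigma * rng.normal(size=W.shape)   # trial-to-trial synaptic fluctuation
    r = reward(hvc @ (W + xi))              # evaluate the noisy rendition
    W += eta * (r - baseline) * xi          # correlate reward with the noise
    baseline += 0.1 * (r - baseline)        # update the reward baseline
```

Because the expected update is proportional to the noise variance times the reward gradient, this rule performs stochastic gradient ascent on the reward without any explicit gradient computation, which is what makes it biologically plausible.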