Population coding of categories
Laurent Bonnasse-Gahot (1) and Jean-Pierre Nadal (1,2)
(1) Centre d'Analyse et de Mathematique Sociales (CAMS, UMR 8557 CNRS-EHESS), Ecole des Hautes Etudes en Sciences Sociales, Paris
(2) Laboratoire de Physique Statistique (LPS, UMR 8550 CNRS-ENS-Paris 6-Paris 7), Ecole Normale Superieure, Paris

We analytically study the coding of a discrete set of categories (e.g. colors, phonemes) by a large assembly of neurons, each category being represented by input stimuli in a continuous space. One main outcome is to explain why and when one should expect more cells to be activated by stimuli close to the class boundaries than by stimuli close to the prototypes of a class. For this analysis we consider population coding schemes, which can also be seen as instances of the exemplar models proposed in the literature to account for phenomena in the psychophysics of categorization. We quantify the coding efficiency by the mutual information between the discrete categories and the neural code. From information-theoretic bounds (Fano), we expect any learning mechanism that aims at minimizing the probability of error to maximize this mutual information. We characterize the properties of the most efficient codes in the limit of a large number of coding cells.
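For reference, the Fano bound invoked above can be written in its standard textbook form. The notation is an assumption made here, not taken from the abstract: \mu denotes the category, r the neural response, M the number of categories, P_e the probability of misclassification, and H_b the binary entropy function.

```latex
% Fano's inequality, with the conditional entropy expressed through the mutual information
H(\mu \mid r) \;=\; H(\mu) - I(\mu ; r) \;\le\; H_b(P_e) + P_e \log (M - 1)
```

Since the right-hand side vanishes as P_e goes to zero, a small error probability is only reachable when I(\mu; r) is close to H(\mu), which is why minimizing the error is expected to push the mutual information up.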

We show that in this limit, up to an additive constant, this mutual information is given by minus the average over the input space of the ratio between two Fisher informations. In the denominator is the usual Fisher information Fcode(x), which gives the sensitivity of the population code to small changes in the input x. In the numerator is the category-related Fisher information Fcat(x), given in terms of the posterior probability of each class: it can be written as the sum over all categories c of the terms P'(c|x)^2 / P(c|x). Typically the posterior probability P(c|x) has a smooth S-shape, which entails that |P'(c|x)|, and therefore Fcat(x), is greatest in the regions near the boundary between categories.

Given the limited resources of a large but finite number of neurons, maximizing the mutual information between the categories and the neural activity requires the Fisher information Fcode to be greatest at the boundary between categories. The higher Fcode(x), the more discriminable are two nearby sensory inputs x and x+dx in the perceptual space given by the output of the neuronal population. In other words, category learning implies better cross-category discrimination than within-category discrimination, a perceptual phenomenon traditionally called categorical perception.

We consider the optimal configuration of a population of neurons in terms of the preferred stimuli of the cells and the widths of their receptive fields. Our results predict that, if there is adaptation to a given categorization task, then (1) cells will be specifically tuned to the dimensions relevant for the classification task at hand; (2) more neurons will be allocated at the class boundaries; and (3) these boundary-specific cells will have a sharper tuning curve along each relevant dimension. All these predictions find support in recent neurophysiological experiments done in the inferotemporal cortex of the monkey brain, a cortical area shown to be specifically involved in classification tasks.
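The two Fisher informations discussed above can be evaluated numerically on a toy problem. The sketch below is only an illustration under assumptions that are not part of the abstract: a one-dimensional stimulus, two equiprobable Gaussian categories, and independent Poisson neurons with Gaussian tuning curves, for which Fcode(x) takes the standard form T * sum_i f_i'(x)^2 / f_i(x). All function names and parameter values are hypothetical.

```python
import numpy as np

def posterior(x, means=(-1.0, 1.0), sigma=1.0):
    """P(c|x) for equiprobable Gaussian categories (toy assumption).
    Returns an array of shape (n_categories, len(x))."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(means, dtype=float)[:, None]
    log_lik = -0.5 * ((x[None, :] - mu) / sigma) ** 2
    log_lik -= log_lik.max(axis=0, keepdims=True)  # numerical stability
    lik = np.exp(log_lik)
    return lik / lik.sum(axis=0, keepdims=True)

def f_cat(x, means=(-1.0, 1.0), sigma=1.0):
    """Category-related Fisher information: sum_c P'(c|x)^2 / P(c|x).
    The derivative is taken by finite differences on the grid x."""
    p = posterior(x, means, sigma)
    dp = np.gradient(p, x, axis=1)
    return np.sum(dp ** 2 / np.maximum(p, 1e-300), axis=0)

def f_code(x, centers, width, peak_rate=20.0, duration=1.0):
    """Fisher information of independent Poisson neurons with Gaussian
    tuning curves f_i (assumed model): duration * sum_i f_i'^2 / f_i,
    which simplifies to duration * sum_i f_i * ((x - c_i) / width^2)^2."""
    x = np.asarray(x, dtype=float)[None, :]
    c = np.asarray(centers, dtype=float)[:, None]
    f = peak_rate * np.exp(-0.5 * ((x - c) / width) ** 2)
    return duration * np.sum(f * ((x - c) / width ** 2) ** 2, axis=0)

def mean_fisher_ratio(x, centers, width, means=(-1.0, 1.0), sigma=1.0):
    """Average over p(x) of Fcat(x) / Fcode(x) on a uniform grid x.
    A lower value corresponds to a higher mutual information."""
    mu = np.asarray(means, dtype=float)[:, None]
    dens = np.exp(-0.5 * ((x[None, :] - mu) / sigma) ** 2)
    p_x = np.mean(dens / (sigma * np.sqrt(2.0 * np.pi)), axis=0)
    dx = x[1] - x[0]
    ratio = f_cat(x, means, sigma) / f_code(x, centers, width)
    return float(np.sum(p_x * ratio) * dx)

# Usage: a population with uniformly spaced preferred stimuli.
x = np.linspace(-4.0, 4.0, 801)
uniform_centers = np.linspace(-5.0, 5.0, 41)
print("Fcat peaks at x =", x[np.argmax(f_cat(x))])
print("mean Fcat/Fcode =", mean_fisher_ratio(x, uniform_centers, width=0.5))
```

Changing `centers` and `width` makes it possible to compare how different allocations of preferred stimuli and tuning widths affect the averaged ratio in this toy setting; it is not meant to reproduce the analysis of the paper.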

These results concern regimes of reasonably well-defined classes and sufficiently large signal-to-noise ratio. We have also studied ill-defined classes and noisy or short-time processing: in that case, the main result is that, contrary to the previous situation, the cells' receptive fields avoid the class boundaries.

References: L. Bonnasse-Gahot and J.-P. Nadal, "Neural Coding of Categories", preprint, May 2007