Felix A. Wichmann
Wednesday 8th March 2017
Ground Floor Seminar Room
25 Howland Street, London, W1T 4JG
Computational models of vision: From early vision to deep convolutional neural networks
Early visual processing has been studied extensively over the past decades, and from these studies a relatively standard model of the first steps in visual processing has emerged. However, most implementations of the standard model cannot take arbitrary images as input, only the typical grating stimuli used in many early vision experiments. I will present an image-based early vision model implementing our knowledge about early visual processing, including oriented spatial-frequency channels, divisive normalisation and optimal decoding. The model explains the classical psychophysical data reasonably well, matching the performance of non-image-based models on contrast detection, contrast discrimination and oblique-masking data. Leveraging the advantage of an image-based model, I show how well our model performs at detecting Gabors masked by patches of natural scenes. Finally, we observe that our model units are extremely sparsely activated: each natural image patch activates few units, and each unit is activated by few stimuli.

In computer vision, recent and rapid advances in deep convolutional neural networks (DNNs) have produced image-based computational models of object recognition which, for the first time, rival human performance. However, although DNNs have undoubtedly proven their usefulness in computer vision, their usefulness as models of human vision is not yet equally clear. On the one hand, a growing number of studies find similarities between DNNs trained on object recognition and properties of the monkey or human visual system. At the same time, however, there are well-known discrepancies, as indicated by so-called adversarial examples. Given our knowledge of early visual processing, one potential source of this difference may lie in the processing of low-level features.
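As an aside, the divisive-normalisation step mentioned above can be illustrated with a minimal sketch. The exponents, the constant, and the uniform pooling over channels below are illustrative textbook choices, not the parameters of the model presented in the talk:

```python
import numpy as np

def divisive_normalization(responses, sigma=0.1, p=2.0):
    """Illustrative divisive normalization of channel responses.

    Each channel's (non-negative) response x_i is rescaled as
        r_i = x_i**p / (sigma**p + sum_j x_j**p),
    so a channel's output depends on the pooled activity of all channels,
    not only on its own input. The exact form used in the talk's model
    may differ (e.g. weighted pooling over orientation/frequency).
    """
    responses = np.asarray(responses, dtype=float)
    pooled = sigma**p + np.sum(responses**p)  # pooled activity + saturation constant
    return responses**p / pooled
```

The division by pooled activity means that a channel's response saturates as overall stimulation grows, one common way of capturing contrast-gain control in early vision models.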
We performed object identification experiments with DNNs and human observers on exactly the same images under conditions favouring single-fixation, purely feed-forward processing. Whilst we clearly find certain similarities, we also find strikingly non-human behaviour in DNNs, as well as marked differences between different DNNs despite similar overall object recognition performance. I will discuss possible reasons for our findings in the light of our knowledge of early visual processing in human observers.
Short biography: Felix Wichmann received his B.A. in Experimental Psychology (1994) as well as his D.Phil. (1999) from the University of Oxford. He was awarded his doctorate for the thesis “Some Aspects of Modelling Spatial Vision: Contrast Discrimination”, written under the supervision of Bruce Henning. His studies were supported by scholarships from the German National Academic Foundation (1992-1997) and University College Oxford (1992-1994), a Jubilee Scholarship from St. Hugh's College Oxford (1994-1997), as well as a Fellowship by Examination from Magdalen College Oxford (1998-2001). After post-doctoral research in Johan Wagemans' group at the University of Leuven in Belgium, Felix worked as a research scientist in Bernhard Schölkopf's Empirical Inference Department at the Max Planck Institute for Biological Cybernetics in Tübingen, Germany, and thereafter held a Professorship in Modelling of Cognitive Processes at the Technical University of Berlin. Since 2011 he has been Professor for Neural Information Processing at the Eberhard Karls Universität Tübingen as well as an adjunct Senior Scientist at the Max Planck Institute for Intelligent Systems. Felix is currently a member of the editorial board of Vision Research.