Object Recognition

David Marr famously defined vision as knowing what is where by looking. Object recognition lies at the heart of our capacity to make sense of the visual world. Understanding object recognition is important not just for understanding the process by which different visual forms are assigned different identity labels, but for understanding many other fundamental neural processes that operate on high-level object representations, including consciousness, visual memory, decision making, and even language. Befitting the central importance and complexity of object recognition, a large piece of the brain, the inferotemporal cortex (IT cortex), is dedicated to solving this challenge.

Our lab has spent the last 15 years dissecting the neural mechanisms underlying face representation in IT cortex. Faces are among the most meaningful visual forms processed by the primate brain, conveying information about identity, expression, gender, age, and direction of attention. fMRI reveals six discrete regions in IT cortex that respond much more strongly to images of faces than to images of other objects, as well as one such region in perirhinal cortex and several in prefrontal cortex. Anatomical experiments show that the face patches are strongly and specifically interconnected. Electrophysiological experiments show that while all six IT patches contain high concentrations of face cells, different patches are functionally distinct, with a view-specific representation in patches ML/MF and a view-invariant representation in patch AM. Our lab recently cracked the code for facial identity used by cells in patches ML/MF and AM (Chang and Tsao, 2017). We discovered that each face cell linearly projects incoming faces, represented as vectors in a “shape-appearance” face space, onto a specific axis, with cells in patches ML/MF specialized for shape axes and those in AM for appearance axes. Using this code, we found that it is possible to decode any human face from the responses of just ~200 face cells. Overall, the face patch system offers a remarkable preparation for dissecting the neural mechanisms underlying form perception, because the system is specialized to process one class of complex forms, and because its computational components are spatially segregated.
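To make the idea of an axis code concrete, here is a minimal sketch in Python on simulated data. It is not the analysis pipeline of Chang and Tsao (2017). The 50-dimensional face space, the random axes and offsets, the Gaussian noise level, and the function names simulate_axis_cells, fit_linear_decoder, and decode_faces are all assumptions made for illustration. Each simulated cell responds with the projection of a face vector onto its own axis, and a least-squares linear map then recovers the face vectors from the population response.

```python
import numpy as np


def simulate_axis_cells(faces, axes, offsets, noise_sd=0.0, rng=None):
    """Simulated responses: projection of each face onto each cell's axis, plus an offset.

    faces:   (n_faces, n_dims) face vectors in an assumed face space
    axes:    (n_cells, n_dims) one hypothetical preferred axis per cell
    offsets: (n_cells,) baseline response per cell
    """
    responses = faces @ axes.T + offsets
    if noise_sd > 0:
        rng = np.random.default_rng() if rng is None else rng
        responses = responses + rng.normal(0.0, noise_sd, responses.shape)
    return responses


def fit_linear_decoder(responses, faces):
    """Least-squares linear map (with a bias term) from responses to face vectors."""
    design = np.hstack([responses, np.ones((responses.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, faces, rcond=None)
    return weights


def decode_faces(responses, weights):
    """Apply a fitted linear decoder to population responses."""
    design = np.hstack([responses, np.ones((responses.shape[0], 1))])
    return design @ weights


rng = np.random.default_rng(0)
n_dims, n_cells = 50, 200  # illustrative sizes, assumed for this sketch

axes = rng.normal(size=(n_cells, n_dims))      # random stand-ins for preferred axes
offsets = rng.normal(size=n_cells)
train_faces = rng.normal(size=(1000, n_dims))  # simulated face vectors, not real data
test_faces = rng.normal(size=(100, n_dims))

# Fit the decoder on one set of simulated faces, then decode held-out faces.
train_responses = simulate_axis_cells(train_faces, axes, offsets, noise_sd=0.1, rng=rng)
weights = fit_linear_decoder(train_responses, train_faces)

test_responses = simulate_axis_cells(test_faces, axes, offsets, noise_sd=0.1, rng=rng)
decoded = decode_faces(test_responses, weights)

# Correlation between decoded and true face features across all held-out faces.
decoding_r = np.corrcoef(decoded.ravel(), test_faces.ravel())[0, 1]
print(f"decoded vs. true feature correlation: {decoding_r:.3f}")
```

One consequence of a pure projection code is visible in this sketch: shifting a face along any direction orthogonal to a cell's axis leaves that simulated cell's noise-free response unchanged, so identity has to be read out from the population rather than from any single cell.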

We are currently extending our work on the face patch system in multiple new directions, including understanding the mechanism for visual imagery and the mechanism by which the IT code for faces is transformed in downstream brain regions of the medial temporal lobe and prefrontal cortex.

Organization and code of macaque IT face patches. (a) Six IT face patches shown on the inflated right hemisphere of a macaque brain. (b) A single cell in a face patch codes facial identity by linearly projecting incoming faces onto a specific axis (the “preferred axis,” shown in red). Across the population, these axes tile a high-dimensional face space, the “shape-appearance” space (Chang and Tsao, 2017). (c) The facial identity of realistic human faces can be accurately reconstructed by a simple linear transformation of face cell population response vectors. Shown on the left are two actual faces presented to a monkey, and on the right are reconstructions using the responses of 205 cells from face patches ML/MF and AM (from Chang and Tsao, 2017).