Integrating Neural Systems for Visual Perception: The Role of Ventral Temporal Cortex (VTC) and Medial Temporal Cortex (MTC) in Rapid and Complex Object Recognition


Human and primate visual perception operates across multiple timescales: some visual attributes are identified in under 200 ms, supported by the ventral temporal cortex (VTC). More complex visual inferences, however, such as recognizing novel objects, require additional time and multiple glances, with the high-acuity fovea and frequent gaze shifts composing object representations piece by piece. While much is understood about rapid visual processing, far less is known about how the brain integrates sequences of visual inputs. The medial temporal cortex (MTC), particularly the perirhinal cortex (PRC), may support this integration, enabling visual inferences beyond VTC capabilities.

Stanford researchers evaluated the MTC’s role in object perception by comparing human visual performance against macaque VTC recordings. While humans and VTC perform similarly at brief viewing times (<200 ms), human performance significantly surpasses VTC-supported performance with extended viewing. MTC plays a key role in this improvement: MTC-lesioned humans perform like VTC-based models even with unlimited time. Eye-tracking experiments further revealed that humans use sequential gaze patterns to make complex visual inferences. Together, these findings suggest that MTC integrates visuospatial sequences into compositional representations, enhancing object perception beyond VTC capabilities.

The researchers used a dataset of object images presented in varied orientations and settings to estimate the performance supported by VTC responses and compare it with human visual processing. They implemented a cross-validation strategy in which trials featured two typical objects and one outlier in randomized configurations. Neural responses recorded from high-level visual cortex were used to train a linear classifier to identify the odd object. This process was repeated many times, with results averaged to yield a performance score for distinguishing each pair of objects.
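A minimal sketch of this decoding logic is below, assuming VTC responses are available as NumPy arrays of shape (trials × neurons). The function name, the pairwise framing of the oddity trials, and all parameter defaults are illustrative choices, not details taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

def pairwise_oddity_score(responses_a, responses_b,
                          n_splits=5, n_repeats=20, seed=0):
    """Cross-validated linear discriminability of two objects from neural data.

    responses_a, responses_b: (n_trials, n_neurons) arrays of responses to
    images of objects A and B. Returns the mean held-out accuracy of a
    linear classifier, used here as a proxy score for distinguishing the pair.
    """
    X = np.vstack([responses_a, responses_b])
    y = np.concatenate([np.zeros(len(responses_a)), np.ones(len(responses_b))])
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_repeats):  # repeat and average, as described above
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True,
                             random_state=int(rng.integers(1_000_000)))
        for train, test in cv.split(X, y):
            clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])
            scores.append(clf.score(X[test], y[test]))
    return float(np.mean(scores))
```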

For comparison, a convolutional neural network (CNN) pre-trained for object classification served as a computational model of VTC. The images were preprocessed for the CNN, and the same experimental setup was followed: a classifier was trained on the network’s features to detect the odd object across trials. The model’s accuracy was then compared with the neural-response-based estimates, offering insight into how closely the model’s visual processing mirrors human-like inference.
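As a hedged illustration, the snippet below builds such a VTC proxy from an ImageNet-pretrained ResNet-50 in torchvision; the specific architecture, weights, and readout layer are stand-in assumptions, since the paper’s exact model is not specified here.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Stand-in VTC model (assumption, not the paper's architecture): an
# ImageNet-pretrained ResNet-50 truncated before the classification head,
# so it emits a 2048-d feature vector per image.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()
backbone.eval()

# Standard ImageNet preprocessing, matching what the backbone was trained on.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def cnn_features(image_paths):
    """Return an (n_images, 2048) feature matrix for a list of image files."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB"))
                         for p in image_paths])
    return backbone(batch).numpy()
```

These features can then be fed to `pairwise_oddity_score()` above in place of neural responses, yielding a model-based estimate of the same per-pair metric.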

The study compares human performance across two viewing regimes: time-restricted (less than 200 ms) and time-unrestricted (self-paced). In time-restricted tasks, participants must rely on immediate visual processing, since there is no opportunity for sequential sampling through eye movements. A three-way visual discrimination task and a match-to-sample paradigm were used to assess both regimes. Results showed a strong correlation between time-restricted human performance and the performance predicted from macaque high-level VTC responses. With unlimited viewing time, however, participants significantly outperformed both VTC-supported performance and computational models based on VTC, demonstrating that humans exceed VTC capabilities when given extended viewing time and suggesting reliance on additional neural mechanisms.
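The qualitative comparison can be expressed as a correlation between per-pair accuracies. The sketch below uses scipy with made-up illustrative numbers (not the paper’s data) purely to show the shape of the analysis.

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative per-object-pair accuracies (values are invented, not from the
# paper): the VTC proxy, time-restricted humans, and time-unrestricted humans.
vtc_model_acc        = np.array([0.62, 0.71, 0.55, 0.80, 0.67])
human_restricted_acc = np.array([0.64, 0.69, 0.57, 0.78, 0.66])
human_unlimited_acc  = np.array([0.85, 0.90, 0.76, 0.95, 0.88])

r_restricted, _ = pearsonr(vtc_model_acc, human_restricted_acc)
r_unlimited, _ = pearsonr(vtc_model_acc, human_unlimited_acc)

# The paper's qualitative pattern: restricted viewing tracks the VTC proxy
# closely, while unlimited viewing sits well above it on every pair.
print(f"r (restricted vs VTC): {r_restricted:.2f}")
print(f"r (unlimited vs VTC):  {r_unlimited:.2f}")
print("mean gain with unlimited viewing:",
      float(np.mean(human_unlimited_acc - human_restricted_acc)))
```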

The study reveals complementary neural systems for visual object perception: the VTC enables rapid visual inferences within roughly 100 ms, while the MTC supports more complex inferences built up through sequential saccades. Time-restricted performance aligns with VTC capabilities, but given more time, humans surpass them, reflecting the MTC’s integration of visuospatial sequences. The findings extend the MTC’s known role beyond memory into perception, emphasizing its contribution to compositional operations. Models of human vision such as convolutional neural networks approximate VTC but fail to capture the MTC’s contributions, pointing to the need for biologically plausible models that integrate both systems.


Check out the Paper. All credit for this research goes to the researchers of this project.



Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.



