GPT-4 with vision (GPT-4V) enables users to instruct GPT-4 to analyze image inputs they provide, and is the latest capability we are making broadly available. Incorporating additional modalities (such as image inputs) into large language models (LLMs) is viewed by some as a key frontier in artificial intelligence research and development. Multimodal LLMs offer the possibility of expanding the impact of language-only systems with novel interfaces and capabilities, enabling them to solve new tasks and provide novel experiences for their users. In this system card, we analyze the safety properties of GPT-4V. Our work on safety for GPT-4V builds on the work done for GPT-4, and here we dive deeper into the evaluations, preparation, and mitigation work done specifically for image inputs.