Dealing with Cognitive Dissonance, the AI Way
by Yennie Jun, July 2024


How do language models handle conflicting instructions in their prompts?

Given contradictory instructions in the system message, the prompt, and examples, which instructions will an LLM follow in its response? Created by the author.


Cognitive dissonance is a psychological term describing the mental discomfort experienced by an individual holding two or more contradictory beliefs. For example, if you’re at the grocery store and see a checkout lane for “10 items or fewer,” but everyone in that line has more than 10 items, what are you supposed to do?

Within the context of AI, I wanted to know how large language models (LLMs) deal with cognitive dissonance in the form of contradictory instructions (for example, prompting an LLM to translate from English to Korean, but giving it examples of English-to-French translations).

In this article, I conduct experiments that give LLMs contradictory information to see which of the conflicting instructions they are more likely to follow.
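To make the setup concrete, here is a minimal sketch of one such experiment: the instruction asks for English-to-Korean translation while the few-shot examples demonstrate English-to-French. The OpenAI Python client, the model name, and the langdetect package are assumptions made here for illustration, not details taken from the article.

```python
# A minimal sketch (not the author's actual code) of the experiment described above:
# the instruction asks for Korean, but the few-shot examples demonstrate French.
from openai import OpenAI          # pip install openai
from langdetect import detect      # pip install langdetect

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompt = (
    "Translate the following sentence from English to Korean.\n\n"
    "English: Good morning.\nTranslation: Bonjour.\n\n"                # French example
    "English: Thank you very much.\nTranslation: Merci beaucoup.\n\n"  # French example
    "English: Where is the train station?\nTranslation:"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # hypothetical model choice
    messages=[{"role": "user", "content": prompt}],
)
answer = response.choices[0].message.content
print(answer)
# 'ko' means the model followed the instruction; 'fr' means it followed the examples.
print("Detected language:", detect(answer))
```

Repeating this over many sentences and counting how often the output is Korean versus French gives a rough measure of which signal the model weighs more heavily.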

As a user, you can tell an LLM what to do in one of three ways (a code sketch combining all three follows this list):

  • Directly describing the task in the system message
  • Directly describing the task in the normal user prompt
  • Providing examples (few-shot) that demonstrate the task
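To show how these three channels can carry conflicting signals at once, here is a hedged sketch of a chat-style message list in which the system message asks for Korean while the few-shot examples demonstrate French. The roles and structure follow the common chat-completion format; this is not code from the article.

```python
# A sketch of the three instruction channels in a single chat-format request.
# The contradiction is planted deliberately: the system message says Korean,
# the few-shot examples say French.
messages = [
    # 1. System message: describes the task directly
    {"role": "system",
     "content": "You are a translator. Translate the user's English text into Korean."},
    # 2. Few-shot examples: demonstrate a *different* task (English -> French)
    {"role": "user", "content": "Good morning."},
    {"role": "assistant", "content": "Bonjour."},
    {"role": "user", "content": "Thank you very much."},
    {"role": "assistant", "content": "Merci beaucoup."},
    # 3. User prompt: the actual input to translate
    {"role": "user", "content": "Where is the train station?"},
]
# Sending `messages` to any chat-completion endpoint and checking whether the reply
# is Korean or French reveals which channel the model aligned with.
```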
