LLMs are often said to have ‘emergent properties’. But what do we even mean by that, and what evidence do we have?
Jul 15, 2024
One of the often-repeated claims about Large Language Models (LLMs), discussed in our ICML’24 position paper, is that they have ‘emergent properties’. Unfortunately, in most cases the speaker/writer does not clarify what they mean by ‘emergence’. But misunderstandings on this issue can have big implications for the research agenda, as well as public policy.
From what I’ve seen in academic papers, there are at least 4 senses in which NLP researchers use this term:
1. A property that a model exhibits despite not being explicitly trained for it. E.g. Bommasani et al. (2021, p. 5) refer to few-shot performance of GPT-3 (Brown et al., 2020) as “an emergent property that was neither specifically trained for nor anticipated to arise”.
2. (Opposite to def. 1): a property that the model learned from the training data. E.g. Deshpande et al. (2023, p. 8) discuss emergence as evidence of “the advantages of pre-training”.
3. A property “is emergent if it is not present in smaller models but is present in larger models” (Wei et al., 2022, p. 2).
4. A version of def. 3, where what makes emergent properties “intriguing” is “their sharpness, transitioning seemingly instantaneously from not present to present, and their unpredictability, appearing at seemingly unforeseeable model scales” (Schaeffer, Miranda, & Koyejo, 2023, p. 1).
For a technical term, this kind of fuzziness is unfortunate. If many people repeat the claim “LLMs have emergent properties” without clarifying what they mean, a reader could infer that there is a broad scientific consensus that this statement is true, according to the reader’s own definition.
I am writing this post after giving many talks about this in NLP research groups all over the world — Amherst and Georgetown (USA), Cambridge, Cardiff and London (UK), Copenhagen (Denmark), Gothenburg (Sweden), Milan (Italy), Genbench workshop (EMNLP’23 @ Singapore) (thanks to everybody in the audience!). This gave me a chance to poll a lot of NLP researchers about what they thought of emergence. Based on the responses from 220 NLP researchers and PhD students, by far the most popular definition is (1), with (4) being the second most popular.
The idea expressed in definition (1) also often gets invoked in public discourse. For example, you can see it in the claim that Google’s PaLM model ‘knew’ a language it wasn’t trained on (which is almost certainly false). The same idea also provoked the following public exchange between a US senator and Melanie Mitchell (a prominent AI researcher, professor at Santa Fe Institute):
What this exchange shows is that the idea of LLM ‘emergent properties’ per definition (1) has implications outside the research world. It contributes to the anxiety about an imminent takeover by super-AGI and to calls for pausing research. It could push policy-makers in the wrong directions, such as banning open-source research — which would further consolidate resources in the hands of a few big tech labs, and ensure they won’t have much competition. It also creates the impression of LLMs as entities independent of the choices of their developers and deployers — which has huge implications for who is accountable for any harms coming from these models. With such high stakes for the research community and society, shouldn’t we at least make sure that the science is sound?
Much about the above versions of ‘emergence’ in LLMs is still debatable: how much do they actually advance the scientific discussion, compared to other terms and known principles that are already in use? I would like to stress that this discussion is completely orthogonal to the question of whether LLMs are useful or valuable. Countless models have been and will be practically useful without any claims of emergence.
Let us start with definition (2): something that a model learned from the training data. Since this is exactly what a machine learning model is supposed to do, does this version of ‘emergence’ add much to ‘learning’?
For definition (3) (something that only large models do), the better performance of larger models is to be expected, given basic machine learning principles: the larger model simply has more capacity to learn the patterns in its training data. Hence, this version of ‘emergence’ also does not add much, unless we expect that the larger models, but not the small ones, do something they weren’t trained for. But then this definition depends on definition (1).
For definition (4), the phenomenon of a sharp change in performance turned out to be attributable to non-continuous evaluation metrics (e.g. for classification tasks like multiple-choice question answering), rather than to the LLMs themselves (Schaeffer, Miranda, & Koyejo, 2023). Furthermore, J. Wei himself acknowledges that the current claims of sharp changes are based on results from models that are only available in relatively few sizes (1B, 7B, 13B, 70B, 150B…), and that if we had more results for intermediate model sizes, the increase in performance would likely turn out to be smooth (Wei, 2023).
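To see why the metric matters, here is a toy simulation of that argument (all numbers are invented for illustration, not real model results): the per-token probability of the correct answer improves smoothly with scale, but an all-or-nothing metric like exact match over a multi-token answer stays near zero for most of the range and then rises quickly.

```python
import numpy as np

# Toy simulation of the metric argument in Schaeffer, Miranda, & Koyejo (2023).
# All numbers below are invented for illustration; they are not real model results.

model_sizes = np.logspace(7, 11, 9)   # hypothetical models from 10M to 100B parameters
answer_len = 5                        # the "task" requires getting 5 tokens exactly right

# Assume the per-token probability of the correct token grows smoothly with log(model size).
p_token = 1 / (1 + np.exp(-(np.log10(model_sizes) - 9)))

per_token_loglik = np.log(p_token)    # a continuous metric: improves smoothly with scale
exact_match = p_token ** answer_len   # an all-or-nothing metric: near zero for a while, then rises fast

for size, ll, em in zip(model_sizes, per_token_loglik, exact_match):
    print(f"{size:10.1e} params | log-likelihood per token: {ll:6.3f} | exact match: {em:.3f}")
```

Sampled at only a handful of widely spaced model sizes, the first metric shows steady progress while the second looks like a sudden jump, even though the underlying improvement is the same smooth curve.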
The unpredictability part of definition (4) was reiterated by J. Wei (2023) as follows: “the ‘emergence’ phenomenon is still interesting if there are large differences in predictability: for some problems, performance of large models can easily be extrapolated from performance of models 1000x less in size, whereas for others, it cannot be extrapolated even from 2x less size.”
However, the cited predictability at 1,000x less compute refers to the GPT-4 report (OpenAI, 2023), where the developers knew the target evaluation in advance and specifically optimized for it. Given that, predictable scaling is hardly surprising theoretically (though still impressive from the engineering point of view). This is in contrast with the unpredictability at 2x less compute for the unplanned BIG-Bench evaluation in Wei et al. (2022). That unpredictability is to be expected, simply due to the unknown interaction between (a) the presence of training data that is similar to the test data, and (b) sufficient model capacity to learn some specific patterns.
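For concreteness, ‘predictable scaling’ here means fitting a scaling law to smaller models and extrapolating it to the target model. Below is a minimal sketch of that idea with invented numbers (the actual data and functional form used for GPT-4 are not public): a power law is fitted to the losses of small models and extrapolated to roughly 1,000x more compute.

```python
import numpy as np

# A minimal sketch of loss extrapolation via a fitted power law.
# The compute budgets and losses below are invented placeholders.

compute = np.array([1e18, 1e19, 1e20, 1e21])   # training compute of smaller models (hypothetical)
loss = np.array([3.10, 2.65, 2.27, 1.94])      # their final losses (hypothetical)

# A power law loss ≈ a * compute^b is a straight line in log-log space.
b, log_a = np.polyfit(np.log(compute), np.log(loss), deg=1)

target_compute = 1e24                          # roughly 1,000x the largest fitted model
predicted_loss = np.exp(log_a) * target_compute ** b
print(f"predicted loss at {target_compute:.0e} FLOPs: {predicted_loss:.2f}")
```

The catch is that this only tells us about the metric the developers chose to track in advance; it says little about performance on evaluations nobody planned for, such as the unplanned BIG-Bench evaluation discussed above.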
Hence, we are left with definition (1): emergent properties are properties that the model was not explicitly trained for. This can be interpreted in two ways:
5. A property is emergent if the model was not exposed to training data for that property.
6. A property is emergent even if the model was exposed to the relevant training data — as long as the model developers were unaware of it.
Per def. 6, it would appear that the research question is actually ‘what data exists on the Web?’ (or in the proprietary training datasets of generative AI companies), and that we are training LLMs as a very expensive method to answer that question. For example, ChatGPT can generate chess moves that are plausible-looking (but often illegal). This is surprising if we think of ChatGPT as a language model, but not if we know that it is a model trained on a web corpus, because such a corpus would likely include not only texts in natural languages, but also materials like chess transcripts, ASCII art, MIDI music, programming code, etc. The term ‘language model’ is actually a misnomer — such models are rather corpus models (Veres, 2022).
Per def. 5, we can prove that some property is emergent only by showing that the model was not exposed, in its training data, to evidence that could have been the basis for the model outputs. And the property cannot be due to lucky sampling in the latent space of the continuous representations: if we are allowed to generate as many samples as we want and cherry-pick, we are eventually going to get some fluent text even from a randomly initialized model — but this should arguably not count as an ‘emergent property’ per definition (5).
For commercial models with undisclosed training data, such as ChatGPT, such a proof is out of the question. But even for the “open” LLMs this is only a hypothesis (if not wishful thinking), because so far we lack detailed studies (or even a methodology) for establishing the exact relation between the amount and kind of evidence in the training data and a particular model output. Per definition (5), emergent properties are a machine learning equivalent of alchemy — and the bar for postulating them should be quite high.
Especially in the face of evidence to the contrary.
Here are some of the empirical results that make it dubious that LLMs have ‘emergent properties’ by definition (5) (the model was not exposed to training data for that property):
- The phenomenon of prompt sensitivity (Lu, Bartolo, Moore, Riedel, & Stenetorp, 2022; Zhao, Wallace, Feng, Klein, & Singh, 2021): LLMs respond differently to prompts that should be semantically equivalent. If we say that models have an emergent property of answering questions, then slightly different ways of posing these questions, and especially a different order of few-shot examples, should not matter. The most likely explanation for prompt sensitivity is that the model responds better to prompts that are more similar to its training data in some way.
- Liang et al. (2022, p. 12) evaluate 30 LLMs and conclude that “regurgitation (of copyrighted materials) risk clearly correlates with model accuracy”. This suggests that the models which ‘remember’ more of their training data perform better.
- McCoy, Yao, Friedman, Hardy, & Griffiths (2023) show that LLM performance depends on probabilities of output word sequences in web texts.
- Lu, Bigoulaeva, Sachdeva, Madabushi, & Gurevych (2024) show that the ‘emergent’ abilities of 18 LLMs can be ascribed mostly to in-context learning. Instruction tuning facilitates in-context learning, but does not seem to have an independent effect.
- For in-context learning itself (first shown in GPT-3 (Brown et al., 2020), and used as the example of ‘emergence’ by Bommasani et al. (2021, p. 5)), the results of Chen, Santoro et al. (2022) suggest that it happens only in Transformers trained on sequences that are structurally similar to the sequences on which in-context learning would be tested.
- Liu et al. (2023) report that ChatGPT and GPT-4 perform better on older benchmarks than on newly released ones, suggesting that many evaluation results may be inflated due to data contamination. OpenAI itself went to great lengths in the GPT-3 paper (Brown et al., 2020) to show how difficult it is to mitigate this problem (a minimal sketch of the kind of n-gram overlap check they describe follows this list). Since we know nothing about the training data of the latest models, external evaluation results may not be meaningful, and internal reports by companies that sell their models as a commercial service have a clear conflict of interest.
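To make the contamination problem concrete, here is a minimal sketch of the kind of n-gram overlap check described in the GPT-3 paper (Brown et al., 2020, used 13-grams; the corpus, test item, and n value below are placeholders). Such a check only catches near-verbatim overlap, not paraphrases or translations, which is part of why contamination is so hard to rule out.

```python
# A minimal sketch of an n-gram overlap contamination check, in the spirit of the
# deduplication analysis in the GPT-3 paper (Brown et al., 2020). The corpus, the
# test item, and the n value below are placeholders, not real data.

def ngrams(text, n=13):
    """Return the set of word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def is_contaminated(test_item, training_corpus, n=13):
    """Flag a test item if any of its n-grams also occurs in any training document."""
    test_ngrams = ngrams(test_item, n)
    return any(test_ngrams & ngrams(doc, n) for doc in training_corpus)

# Placeholder usage: an exact-overlap check like this misses paraphrases entirely.
corpus = ["a web page that happens to quote the benchmark question word for word ..."]
question = "a benchmark question that we would like to believe the model has never seen"
print(is_contaminated(question, corpus, n=8))
```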
A well-known effort to propose a methodology that would avoid at least the data contamination problem is the ‘sparks of AGI’ study (Bubeck et al., 2023). Using the methodology of newly constructed test cases, checked against public web data, and their perturbations, the authors notably concluded that GPT-4 possesses “a very advanced theory of mind”. At least two studies have come to the opposite conclusion (Sap, Le Bras, Fried, & Choi, 2022; Shapira et al., 2024). The most likely reason for the failure of this methodology is that while we can check for direct matches on the web, we could still miss some highly similar cases (e.g. the well-known example of a unicorn drawn in tikz from that paper could be based on the stackoverflow community drawing other animals in tikz). Furthermore, commercial LLMs such as GPT-4 could also be trained on data that is not publicly available: in the case of OpenAI, hundreds of researchers and other users of GPT-3 submitted a lot of data through the API before OpenAI changed their terms of service to not use such data for training by default.
This is not to say that it is absolutely impossible that LLMs could work well outside of their training distribution. Some degree of generalization is happening, and the best-case scenario is that it is due to interpolation of patterns that were observed in the training data individually, but not together. But at what point we would say that the result is something qualitatively new, what kind of similarity to training data matters, and how we could identify it — these are all still unresolved research questions.
As I mentioned, I had a chance to give a talk about this in several NLP research groups. At the very beginning of these talks, before I presented the above discussion, I asked the audience a few questions, including whether they personally believed that LLMs had emergent properties (according to their preferred definition, which, as shown above, was predominantly (1)). I also asked them about their perception of the consensus in the field — what did they think most other NLP researchers thought about this? For the first question I have answers from 259 researchers and PhD students, and for the second — from 360 (note to self: give people more time to connect to the poll).
The results were striking: while most respondents were skeptical or unsure about LLM emergent properties themselves (only 39% agreed with that statement), 70% thought that most other researchers did believe this.
This is in line with several other false sociological beliefs: e.g. many NLP researchers don’t think that NLP leaderboards are particularly meaningful, or that scaling will solve everything, but they do think that other NLP researchers believe that (Michael et al., 2023). In my sample, the idea that LLMs have emergent properties is similarly held by a minority of researchers, but it is misperceived to be the majority view. And even within that minority the conviction is not very firm. In four of my talks, after presenting the above discussion, I also asked the audience what they thought now. In this sample of 70 responses, 83% of those who originally agreed with the statement “LLMs have emergent properties” changed their belief to either disagreeing (13.9%) or being unsure (69.4%).
In retrospect, “agree/disagree/unsure” is not the best choice of options for this poll. As scientists, we can hardly ever be 100% sure: as Yann LeCun put it in the Munk debate, we cannot even prove that there is no teapot orbiting Jupiter right now. Our job is not to fall into such distracting rabbit holes, but to formulate and test hypotheses that would advance our understanding of the phenomenon we are studying. For ‘emergence’ in LLMs, I think we are still at the ‘formulation’ stage — since even after all the above work on clarifying ‘emergence’, we still don’t have a research question for which it is clear how to obtain empirical evidence.
The key unresolved question is what kind of interpolation of existing patterns would even count as something new enough to qualify as an ‘emergent phenomenon’ in the domain of natural language data. This domain is particularly hard because it mixes different kinds of information (linguistic, social, factual, commonsense), and that information may be present in different ways (explicit in the context, implicit, or requiring reasoning over long contexts). See Rogers, Gardner, & Augenstein (2023, sec. 8.2) for a discussion of the different skills involved in just the question answering task.
📢 If the relationship between LLM output and its training data is a problem that you (or someone you know) would like to figure out — there are funded postdoc / PhD positions to work on it in beautiful Copenhagen! (apply by Nov 15/22 2024)