Evaluating Long Context Large Language Models
By Yennie Jun | July 2024


There is a race towards language models with longer context windows. But how good are they, and how can we know?

Figure: The context window of language models has been growing at an exponential rate over the last few years. Figure created by the author.

This article was originally published on Art Fish Intelligence.

The context window of large language models — the amount of text they can process at once — has been increasing at an exponential rate.

Around 2018, language models like BERT, T5, and GPT-1 could take at most 512 tokens as input. Now, in the summer of 2024, this number has jumped to 2 million tokens (in publicly available LLMs). But what does this mean for us, and how do we evaluate these increasingly capable models?
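For intuition about what a "token" is, here is a minimal sketch using OpenAI's open-source tiktoken library. This is illustrative only: BERT, T5, GPT, and Gemini each use their own tokenizers, so exact counts vary from model to model.

```python
# Tokenize a sentence to see how words map to tokens.
# cl100k_base is the encoding used by GPT-4-era models; other models differ.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "The context window is the amount of text a model can process at once."
tokens = enc.encode(text)

print(f"{len(text.split())} words -> {len(tokens)} tokens")
print(tokens[:10])                              # the first few token IDs
print([enc.decode([t]) for t in tokens[:10]])   # the text pieces they represent
```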

The recently released Gemini 1.5 Pro model can take in up to 2 million tokens. But what does 2 million tokens even mean?

If we estimate that 3 words correspond to roughly 4 tokens, then 2 million tokens can (almost) fit the entire Harry Potter and The Lord of the Rings series.

(The total word count of all seven books in the Harry Potter series is 1,084,625. The total word count of all three volumes of The Lord of the Rings is 481,103. Together, that is 1,084,625 + 481,103 = 1,565,728 words, or roughly 2.1 million tokens, just over the 2-million-token window.)
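The back-of-the-envelope arithmetic can be checked with a few lines of Python. The 4-tokens-per-3-words ratio is a common rule of thumb, not an exact figure; actual counts depend on the tokenizer of the specific model.

```python
# Rough estimate of whether both series fit in a 2-million-token window.
HARRY_POTTER_WORDS = 1_084_625   # all seven Harry Potter books
LOTR_WORDS = 481_103             # The Lord of the Rings trilogy
CONTEXT_WINDOW = 2_000_000       # Gemini 1.5 Pro's advertised limit

total_words = HARRY_POTTER_WORDS + LOTR_WORDS
estimated_tokens = total_words * 4 / 3   # heuristic: ~4 tokens per 3 words

print(f"Total words:      {total_words:,}")
print(f"Estimated tokens: {estimated_tokens:,.0f}")
print(f"Fits in window:   {estimated_tokens <= CONTEXT_WINDOW}")
```

Running this gives about 2.09 million estimated tokens, slightly more than the 2-million-token limit, which is why the two series only almost fit.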
