Context Window
The maximum amount of text (measured in tokens) that a language model can process in a single interaction. It includes both the input prompt and the generated output. Larger context windows allow models to handle longer documents.
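Because the window must hold both the prompt and the reply, tokens reserved for output shrink the space available for input. A minimal sketch of that budget arithmetic (the numbers are illustrative, not tied to any specific model):

```python
# Minimal sketch of a context-window budget check. The window must hold
# both the input prompt and the generated output, so tokens reserved for
# the output reduce the space available for the input.

CONTEXT_WINDOW = 200_000   # e.g. a 200K-token model (illustrative)
MAX_OUTPUT = 4_096         # tokens reserved for the model's reply

def fits_in_window(prompt_tokens: int,
                   window: int = CONTEXT_WINDOW,
                   max_output: int = MAX_OUTPUT) -> bool:
    """Return True if the prompt plus the reserved output fits in the window."""
    return prompt_tokens + max_output <= window

print(fits_in_window(150_000))   # True: 150K in + 4K out fits in 200K
print(fits_in_window(198_000))   # False: 198K + 4K exceeds 200K
```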
Why It Matters
Context window size determines what tasks an LLM can handle — from short Q&A to analyzing entire codebases or books. It is a key differentiator between models.
Example
Claude's 200K-token context window can process an entire novel in one go, while earlier models with 4K-token windows could only handle a few pages.
Think of it like...
Like the size of a desk — a small desk forces you to work with only a few papers at a time, while a massive desk lets you spread out an entire project and see everything at once.
Related Terms
Token
The basic unit of text that language models process. A token can be a word, part of a word, or a punctuation mark. Text is broken into tokens before being fed into an LLM, and the model generates output one token at a time.
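A toy illustration of how text might be split into tokens (hypothetical: real LLM tokenizers use subword schemes such as byte-pair encoding, so their splits differ from this word-and-punctuation sketch):

```python
import re

# Toy tokenizer for illustration only. Real LLM tokenizers use subword
# vocabularies (e.g. byte-pair encoding), so actual token counts differ.
def toy_tokenize(text: str) -> list[str]:
    # Split into word-like runs and individual punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("Context windows aren't unlimited.")
print(tokens)       # ['Context', 'windows', 'aren', "'", 't', 'unlimited', '.']
print(len(tokens))  # 7
```

Note how even the apostrophe becomes its own unit here; real tokenizers make similar (though smarter) sub-word splits, which is why token counts exceed word counts.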
Long Context
The ability of AI models to process very large amounts of input text — typically 100K tokens or more — enabling analysis of entire books, codebases, or document collections.
Retrieval-Augmented Generation
A technique that enhances LLM outputs by first retrieving relevant information from external knowledge sources and then using that information as context for generation. RAG combines the power of search with the fluency of language models.
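The retrieve-then-generate flow can be sketched as follows (a toy illustration with hypothetical helper names: real systems use vector search and an actual LLM call rather than these stand-ins):

```python
# Toy sketch of the RAG flow: retrieve relevant text, then place it in
# the prompt as context. Retrieval here is naive word overlap; real
# systems typically use embedding-based vector search.

DOCS = [
    "The 2023 report lists revenue figures.",
    "Our office handles shipping requests.",
    "The context window limits how much text a model can read at once.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by how many query words they share, keep the top k.
    q = set(query.lower().split())
    return sorted(docs,
                  key=lambda d: len(q & set(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Retrieved passages become extra context for the model to generate from.
    context = "\n".join(docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

query = "What does the context window limit?"
print(build_prompt(query, retrieve(query, DOCS)))
```

Because only the retrieved passages enter the prompt, RAG lets a model draw on a knowledge base far larger than its context window.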