Token
The basic unit of text that language models process. A token can be a word, part of a word, or a punctuation mark. Text is broken into tokens before being fed into an LLM, and the model generates output one token at a time.
Why It Matters
Token counts determine API costs, context window limits, and processing speed. Understanding tokens is essential for optimizing LLM usage and managing costs.
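Since billing is typically per token, cost estimation is simple arithmetic once you know the token counts. A minimal sketch, with hypothetical per-token prices (the rates and the `request_cost` helper below are illustrative assumptions, not any provider's actual pricing):

```python
# Hypothetical per-token prices (illustrative only; check your provider's pricing page).
PRICE_PER_INPUT_TOKEN = 3.00 / 1_000_000    # e.g. $3 per million input tokens
PRICE_PER_OUTPUT_TOKEN = 15.00 / 1_000_000  # e.g. $15 per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one API request, given its token counts."""
    return (input_tokens * PRICE_PER_INPUT_TOKEN
            + output_tokens * PRICE_PER_OUTPUT_TOKEN)

cost = request_cost(input_tokens=2_000, output_tokens=500)
print(f"${cost:.4f}")  # $0.0135
```

Output tokens are often priced several times higher than input tokens, which is why long generations dominate the bill even when prompts are short.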
Example
The sentence 'I love AI' might be split into 3 tokens: 'I', ' love', ' AI' (note the leading spaces, which many tokenizers attach to the following word). The word 'unbelievable' might be split into three subword tokens: 'un', 'believ', 'able'.
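How a word gets split depends entirely on the tokenizer's vocabulary. A toy greedy longest-match splitter (a simplification for illustration, not any production algorithm; the tiny `vocab` below is invented) reproduces the splits above:

```python
def greedy_tokenize(text: str, vocab: set[str]) -> list[str]:
    """Split text by repeatedly taking the longest vocabulary match,
    falling back to single characters when nothing matches."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest candidate first
            piece = text[i:j]
            if piece in vocab or j == i + 1:  # single chars are always allowed
                tokens.append(piece)
                i = j
                break
    return tokens

vocab = {"I", " love", " AI", "un", "believ", "able"}
print(greedy_tokenize("I love AI", vocab))     # ['I', ' love', ' AI']
print(greedy_tokenize("unbelievable", vocab))  # ['un', 'believ', 'able']
```

A word the vocabulary has never seen still tokenizes, just into more, smaller pieces, which is why rare or misspelled words consume more tokens.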
Think of it like...
Like breaking a sentence into Scrabble tiles — sometimes a tile is a whole word, sometimes it is just a piece of one, but the model works with these individual pieces.
Related Terms
Tokenizer
A component that converts raw text into tokens, each mapped to an integer ID that a language model can process. Different tokenizers split text differently, affecting model performance and efficiency.
Context Window
The maximum amount of text (measured in tokens) that a language model can process in a single interaction. It includes both the input prompt and the generated output. Larger context windows allow models to handle longer documents.
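Because the prompt and the generated output share one token budget, checking whether a request fits is a single comparison. A minimal sketch (the function name and the example numbers are assumptions for illustration):

```python
def fits_in_context(prompt_tokens: int, max_output_tokens: int,
                    context_window: int) -> bool:
    """The prompt and the generated output share one token budget."""
    return prompt_tokens + max_output_tokens <= context_window

# An 8,000-token window cannot hold a 6,500-token prompt
# plus up to 2,000 generated tokens:
print(fits_in_context(6500, 2000, 8000))  # False
print(fits_in_context(6500, 1500, 8000))  # True
```

In practice this is why a long prompt silently shrinks the room left for the model's answer.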
Byte-Pair Encoding
A subword tokenization algorithm that starts with individual characters and iteratively merges the most frequent pairs to create a vocabulary of subword units. It balances vocabulary size with handling of rare words.
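The merge loop described above can be sketched in a few lines. This is a toy version trained on an invented four-word corpus (the corpus and the number of merges are assumptions for illustration; real BPE vocabularies are learned from billions of words):

```python
from collections import Counter

def get_pair_counts(words):
    """Count adjacent symbol pairs across the corpus, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Tiny corpus: each word as a tuple of characters, mapped to its frequency.
corpus = {tuple("lower"): 5, tuple("lowest"): 2,
          tuple("newer"): 6, tuple("wider"): 3}
merges = []
for _ in range(4):
    pairs = get_pair_counts(corpus)
    best = max(pairs, key=pairs.get)  # most frequent adjacent pair
    merges.append(best)
    corpus = merge_pair(corpus, best)

print(merges[0])  # ('e', 'r') -- the most frequent pair merges first
```

Each learned merge becomes a vocabulary entry, so frequent character sequences like 'er' end up as single subword tokens while rare words stay split into smaller pieces.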