Artificial Intelligence

Chunking

The process of breaking large documents into smaller pieces (chunks) before creating embeddings for use in retrieval-augmented generation (RAG) systems. Both the chunk size and the splitting strategy significantly affect retrieval quality.

Why It Matters

Chunking strategy has a direct effect on RAG quality. If chunks are too large, each retrieved chunk carries irrelevant content along with the relevant passage; if they are too small, a chunk loses the surrounding context needed to interpret it.

Example

Splitting a 100-page manual into overlapping 500-token chunks so that each chunk contains enough context to be useful when retrieved for answering questions.
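The example above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `chunk_tokens` and `chunk_text` are hypothetical, the 50-token overlap is an assumed value (the example does not specify one), and whitespace-separated words stand in for tokens. A real system would normally count tokens with the tokenizer of its embedding model.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 500, overlap: int = 50) -> List[List[str]]:
    """Split a token list into fixed-size chunks that share `overlap` tokens.

    chunk_size=500 follows the example; overlap=50 is an assumed value.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a chunk has reached the end of the document,
        # so no trailing chunk consists only of repeated overlap.
        if start + chunk_size >= len(tokens):
            break
    return chunks


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Chunk raw text, using whitespace-separated words as stand-in tokens."""
    words = text.split()
    return [" ".join(chunk) for chunk in chunk_tokens(words, chunk_size, overlap)]
```

Because each chunk repeats the last `overlap` tokens of the previous one, a sentence that straddles a chunk boundary still appears intact in at least one chunk, provided it is no longer than the overlap.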

Think of it like...

Like cutting a pizza: with too few large slices, each one is unwieldy; with too many tiny pieces, no single piece still shows how the toppings fit together. At the right size, each piece is easy to handle and still complete on its own.

Related Terms