Sequence-to-Sequence
A model architecture that transforms one sequence into another, where the input and output can have different lengths. It uses an encoder to process the input and a decoder to generate the output.
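The encode-then-decode flow can be sketched without any neural network. In this toy sketch (all names and the lookup table are illustrative assumptions, not a trained model), the encoder compresses the whole input into one context object, and the decoder generates output tokens from that context until an end-of-sequence marker:

```python
# Toy sketch of the seq2seq idea (not a neural network).
# The encoder reduces the input sequence to one fixed "context";
# the decoder emits output tokens from that context until EOS.

EOS = "<eos>"

# Hypothetical lookup table standing in for a trained decoder:
# maps an encoded context to French output tokens.
PHRASE_TABLE = {
    ("how", "are", "you", "?"): ["comment", "allez-vous", "?"],
}

def encode(tokens):
    """Compress the input sequence into a single context (here, a tuple)."""
    return tuple(t.lower() for t in tokens)

def decode(context):
    """Generate output tokens one at a time until EOS is produced."""
    generated = PHRASE_TABLE.get(context, []) + [EOS]
    output = []
    for token in generated:
        if token == EOS:
            break
        output.append(token)
    return output

def translate(sentence):
    tokens = sentence.replace("?", " ?").split()
    return " ".join(decode(encode(tokens)))

print(translate("How are you?"))  # comment allez-vous ?
```

Note that the input has four tokens and the output three: nothing forces the two sequences to align word-for-word, which is the defining property of the architecture.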
Why It Matters
Seq2seq models power machine translation, text summarization, question answering, and code generation — any task where you need to convert one form of text to another.
Example
A translation model taking the English sequence 'How are you?' and generating the French sequence 'Comment allez-vous?' — different words, different length.
Think of it like...
Like a simultaneous interpreter at the UN who listens to an entire thought in one language and then reproduces it in another — the input and output do not need to match word-for-word.
Related Terms
Encoder-Decoder
An architecture where the encoder compresses the input into a fixed-size representation and the decoder generates output from that representation. This structure is used in translation, summarization, and image captioning.
Transformer
A neural network architecture introduced in 2017 that uses self-attention mechanisms to process sequential data in parallel rather than sequentially. Transformers are the foundation of modern LLMs like GPT, Claude, and Gemini.
Attention Mechanism
A component in neural networks that allows the model to focus on the most relevant parts of the input when producing each part of the output. It assigns different weights to different input elements based on their relevance.
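The weighting step described above can be shown in a few lines. This is a minimal sketch of dot-product attention over tiny hand-picked vectors (the vectors and function names are assumptions for illustration, not from any trained model): score each encoder state against a query, turn the scores into weights with a softmax, then take the weighted sum of the states.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, states):
    """Dot-product attention: weight each state by its relevance to the query."""
    # Relevance score: dot product between the query and each encoder state.
    scores = [sum(q * s for q, s in zip(query, state)) for state in states]
    weights = softmax(scores)
    # Context vector: weighted sum of the encoder states.
    context = [sum(w * state[i] for w, state in zip(weights, states))
               for i in range(len(query))]
    return weights, context

# Three toy encoder states; the query points along the first dimension,
# so the states aligned with it receive higher weights.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([1.0, 0.0], states)
```

Here `weights[0]` and `weights[2]` come out larger than `weights[1]`, because the first and third states overlap with the query while the second does not: that is the "focus on the most relevant parts" behavior in miniature.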
Machine Translation
The use of AI to automatically translate text or speech from one language to another. Modern neural machine translation uses transformer models and achieves near-human quality for many language pairs.