Residual Connection
A shortcut that lets the input of a layer (or block of layers) bypass the intermediate computation and be added directly to the output. Because the identity path carries gradients backward unattenuated, much deeper networks become trainable.
Why It Matters
Residual connections (skip connections) made very deep networks practical. They are a fundamental component of modern transformers, ResNets, and virtually every deep architecture.
Example
In a ResNet, the input to a block is added to the block's output: output = F(x) + x. If the block learns nothing useful, the network still passes the original signal through.
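The identity shortcut can be sketched in a few lines. Below is a minimal NumPy sketch (the function and variable names are illustrative, not from any specific library), where a single linear layer plus ReLU stands in for the block's learned transformation F:

```python
import numpy as np

def residual_block(x, weight):
    """Toy residual block: output = F(x) + x, where F is one
    linear layer followed by a ReLU (illustrative choice)."""
    fx = np.maximum(0.0, x @ weight)  # F(x): linear layer + ReLU
    return fx + x                     # skip connection adds the input back

# If the block learns nothing useful (weight = 0), F(x) = 0 and the
# original signal passes through unchanged.
x = np.array([1.0, -2.0, 3.0])
zero_weight = np.zeros((3, 3))
out = residual_block(x, zero_weight)  # out equals x exactly
```

With zero weights the block contributes nothing, yet the output still equals the input, which is exactly the "still passes the original signal through" behavior described above.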
Think of it like...
Like an express elevator that skips floors — it gives information a direct path through the network, preventing it from getting lost or diluted across many layers.
Related Terms
Vanishing Gradient Problem
A training difficulty in deep networks where gradients become exponentially smaller as they are propagated back through many layers, making it nearly impossible for early layers to learn.
Deep Learning
A specialized subset of machine learning that uses artificial neural networks with multiple layers (hence 'deep') to learn complex patterns in data. Deep learning excels at tasks like image recognition, speech processing, and natural language understanding.
Transformer
A neural network architecture introduced in 2017 that uses self-attention mechanisms to process sequential data in parallel rather than sequentially. Transformers are the foundation of modern LLMs like GPT, Claude, and Gemini.