Residual Connection
A shortcut that lets the input of a layer (or block of layers) bypass the intermediate computation and be added directly to the output. Because the identity path carries gradients backward unattenuated, much deeper networks become trainable.
Why It Matters
Residual connections (skip connections) made very deep networks practical. They are a fundamental component of modern transformers, ResNets, and virtually every deep architecture.
Example
In a ResNet, the input to a block is added to the block's output: output = F(x) + x. If the block learns nothing useful, the network still passes the original signal through.
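The identity shortcut can be sketched in a few lines. Below is a minimal NumPy sketch (the function and variable names are illustrative, not from any specific library), where a single linear layer plus ReLU stands in for the block's learned transformation F:

```python
import numpy as np

def residual_block(x, weight):
    """Toy residual block: output = F(x) + x, where F is one
    linear layer followed by a ReLU (illustrative choice)."""
    fx = np.maximum(0.0, x @ weight)  # F(x): linear layer + ReLU
    return fx + x                     # skip connection adds the input back

# If the block learns nothing useful (weight = 0), F(x) = 0 and the
# original signal passes through unchanged.
x = np.array([1.0, -2.0, 3.0])
zero_weight = np.zeros((3, 3))
out = residual_block(x, zero_weight)  # out equals x exactly
```

With zero weights the block contributes nothing, yet the output still equals the input, which is exactly the "still passes the original signal through" behavior described above.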
Think of it like...
Like an express elevator that skips floors — it gives information a direct path through the network, preventing it from getting lost or diluted across many layers.
Related Terms
Vanishing Gradient Problem
A training difficulty in deep networks where gradients become exponentially smaller as they are propagated back through many layers, making it nearly impossible for early layers to learn.
Deep Learning
A specialized subset of machine learning that uses artificial neural networks with multiple layers (hence 'deep') to learn complex patterns in data. Deep learning excels at tasks like image recognition, speech processing, and natural language understanding.
Transformer
A neural network architecture introduced in 2017 that uses self-attention mechanisms to process sequential data in parallel rather than sequentially. Transformers are the foundation of modern LLMs like GPT, Claude, and Gemini.