Activation Function
A mathematical function applied to the output of each neuron in a neural network to introduce non-linearity. Without activation functions, any stack of layers would collapse into a single linear transformation, no matter how many layers it had.
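The point about linear transformations can be checked with a small sketch. The weight matrices and input below are hypothetical values chosen only for illustration, and biases are left out for brevity. Two layers with no activation between them give the same result as one layer with merged weights, while inserting ReLU between them changes the result.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def apply(w, x):
    """Apply weight matrix w to input vector x (no bias, for brevity)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

# Hypothetical weights and input, not taken from any real model.
W1 = [[1.0, 2.0], [0.0, -1.0]]
W2 = [[3.0, 1.0], [2.0, 0.5]]
x = [1.0, -2.0]

# Two layers applied in sequence with no activation in between.
two_layers = apply(W2, apply(W1, x))

# One layer whose weights are the product W2 * W1.
one_layer = apply(matmul(W2, W1), x)

# The same two layers with ReLU applied to the hidden values.
hidden = [max(0.0, h) for h in apply(W1, x)]
with_relu = apply(W2, hidden)

print(two_layers)  # [-7.0, -5.0]
print(one_layer)   # [-7.0, -5.0]
print(with_relu)   # [2.0, 1.0]
```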
Why It Matters
Activation functions determine how neurons fire and enable neural networks to learn complex, non-linear patterns that exist in real-world data.
Example
ReLU (Rectified Linear Unit) outputs the input if positive and zero otherwise — simple but highly effective for training deep networks.
Think of it like...
Like a light dimmer switch that decides how much signal to pass through — it controls whether and how strongly a neuron activates.
Related Terms
ReLU
Rectified Linear Unit, one of the most widely used activation functions in deep learning. It outputs the input directly if positive, and zero otherwise: f(x) = max(0, x).
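As a minimal sketch, the formula f(x) = max(0, x) translates almost directly into code. The function name relu and the sample inputs are illustrative choices, not part of any particular library.

```python
def relu(x: float) -> float:
    """Return x if it is positive, otherwise zero."""
    return max(0.0, x)

# Negative inputs are clipped to zero; positive inputs pass through.
samples = [-2.0, -0.5, 0.0, 1.5, 3.0]
outputs = [relu(v) for v in samples]
print(outputs)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```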
Sigmoid
An activation function that squashes input values into a range between 0 and 1, creating an S-shaped curve. It is commonly used for binary classification outputs and for the gates in recurrent architectures such as LSTMs.
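The standard sigmoid formula is 1 / (1 + e^(-x)). The sketch below is one possible way to write it, using only Python's math module. Splitting on the sign of x is an added design choice, not something the definition requires: it keeps math.exp from overflowing for large negative inputs.

```python
import math

def sigmoid(x: float) -> float:
    """Map any real number into the open interval (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the algebraically equivalent form
    # e^x / (1 + e^x) so the exponent is never a large positive number.
    z = math.exp(x)
    return z / (1.0 + z)

print(sigmoid(0.0))  # 0.5
```

Large positive inputs approach 1 and large negative inputs approach 0, which gives the S-shaped curve.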
Softmax
A function that converts a vector of numbers into a probability distribution, where each value is between 0 and 1 and all values sum to 1. It is typically used as the final layer in multi-class classification models.
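A rough sketch of softmax in plain Python follows. Each output is the exponential of the corresponding input divided by the sum of all the exponentials. Subtracting the maximum before exponentiating is a common numerical-stability step added here as an assumption; it does not change the result. The input scores are made up for illustration.

```python
import math

def softmax(values):
    """Convert a list of real numbers into probabilities that sum to 1."""
    m = max(values)
    # Shifting by the maximum keeps every exponent at or below zero.
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three classes.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The largest score receives the largest probability, and the ordering of the inputs is preserved in the outputs.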
Neural Network
A computing system inspired by the biological neural networks in the human brain. It consists of interconnected nodes (neurons) organized in layers that process information and learn to recognize patterns.
Vanishing Gradient Problem
A training difficulty in deep networks where gradients shrink roughly exponentially as they are propagated back through many layers, so the earliest layers receive almost no learning signal and train very slowly or not at all.
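A simplified illustration of the effect, assuming a chain of sigmoid layers in which every weight is 1 and every pre-activation value is 0. Both are assumptions made only to keep the arithmetic visible; real networks differ. The sigmoid derivative is at most 0.25, so each layer multiplies the backpropagated gradient by at most that factor. The helper name gradient_scale is hypothetical.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Derivative of sigmoid: s * (1 - s), which peaks at 0.25 when x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

def gradient_scale(depth: int, pre_activation: float = 0.0) -> float:
    """Factor by which a gradient is scaled after passing back through
    `depth` sigmoid layers (assumes all weights equal 1)."""
    return sigmoid_grad(pre_activation) ** depth

for depth in (1, 5, 10, 20):
    print(depth, gradient_scale(depth))
```

Even in this best case for sigmoid, ten layers scale the gradient to below one millionth of its original size.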