Machine Learning

Gradient Clipping

A technique that limits gradient magnitudes during training to prevent exploding gradients. In the common norm-based variant, if the gradient's norm exceeds a chosen threshold, the gradient is rescaled so its norm equals the threshold; a value-based variant instead clips each gradient component to a fixed range.

Why It Matters

Gradient clipping is a simple but essential safeguard when training deep networks, especially recurrent models, where gradients can grow without bound through long chains of multiplications. Most modern training frameworks provide it as a built-in utility (for example, torch.nn.utils.clip_grad_norm_ in PyTorch), though it typically must be enabled explicitly rather than being applied by default.

Example

Setting a gradient clip norm of 1.0: if the gradient vector's L2 norm exceeds 1.0, the vector is rescaled so its norm is exactly 1.0, preserving its direction.
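The rescaling above can be sketched in a few lines of plain Python. This is a minimal illustration of norm-based clipping on a flat list of gradient values; the function name clip_grad_norm and its signature are illustrative, not taken from any particular framework:

```python
import math

def clip_grad_norm(grads, max_norm):
    """Rescale grads so their global L2 norm is at most max_norm."""
    # Global L2 norm across all gradient components.
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        # One shared scale factor preserves the gradient's direction.
        scale = max_norm / total_norm
        return [g * scale for g in grads]
    return grads

# A gradient of [3.0, 4.0] has norm 5.0; with max_norm=1.0 it is
# scaled by 0.2, giving a vector of norm 1.0 in the same direction.
clipped = clip_grad_norm([3.0, 4.0], max_norm=1.0)
```

Note that the whole vector is scaled by a single factor rather than each component being capped independently, which is what distinguishes norm clipping from value clipping.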

Think of it like...

Like a speed limiter on a car — it lets you drive normally but prevents dangerous speeds, keeping the system stable.

Related Terms