Vanishing and exploding gradients are problems that can happen while a neural network is learning. It's like trying to whisper a secret across a very long hallway: sometimes the message arrives too quiet, and sometimes way too loud.
Why Gradients Matter
Imagine you're learning to ride a bike. Every time you wobble, someone gives you a hint about how to balance better. These hints are like gradients: they tell the neural network how to adjust its thinking so it can learn faster and better.
Now, if the hallway is really long, those hints might get lost along the way. If the message gets too quiet, that's a vanishing gradient: the bike rider doesn't hear enough to keep improving. But if the message gets amplified too much, like an echo that grows louder and louder, that's an exploding gradient: the rider gets overwhelmed by all the noise.
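You can see the "long hallway" effect with plain numbers. The sketch below is a made-up illustration, not something from a real network: it just multiplies a hint by the same factor once per layer, using an assumed factor of 0.5 (a fading whisper) or 1.5 (a growing echo) across an assumed 30 layers.

```python
# A minimal numeric sketch of vanishing/exploding gradients.
# Assumption: each "layer" multiplies the hint (gradient) by the same factor
# as it passes backward. The factors 0.5 and 1.5 and the depth of 30 are
# illustrative choices, not values from any real network.

def pass_hint_through_layers(hint, per_layer_factor, num_layers):
    """Multiply the hint by the same factor once per layer and return the result."""
    for _ in range(num_layers):
        hint *= per_layer_factor
    return hint

depth = 30  # a fairly "long hallway" of layers

quiet = pass_hint_through_layers(1.0, 0.5, depth)  # whisper gets quieter each step
loud = pass_hint_through_layers(1.0, 1.5, depth)   # echo gets louder each step

print(f"Factor 0.5 over {depth} layers: {quiet:.10f}  (vanishing)")
print(f"Factor 1.5 over {depth} layers: {loud:,.0f}  (exploding)")
```

With a factor just below 1, the hint shrinks to roughly a billionth of its original size; just above 1, it grows into the hundreds of thousands. That is the whole problem in miniature: the message either fades to nothing or drowns everything out.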
A Real-Life Example
Think of stacking blocks one on top of another. If you're building a tower from the bottom up, each block helps support the ones above. But if you're trying to adjust the tower from the top down and the blocks are too wobbly or too heavy, it's hard for the lower blocks to stay steady. The earliest layers of a network are like those lower blocks: they struggle when the signal coming down to them has vanished or exploded.
So, vanishing and exploding gradients are like whispers or shouts in a long hallway: when the message arrives at the right volume, learning works, but when it fades out or blows up, learning suffers.
Examples
- Imagine a group of people passing a whisper from one to the next; by the time it reaches the end, it's barely audible (vanishing gradients).
- A chain of loud echoes grows louder with each repetition until it becomes deafening (exploding gradients).
- A child working through a long, multi-step multiplication problem might forget the earliest steps by the time they reach the end, much like early layers losing the learning signal.
See also
- What are self-attention mechanisms?
- No one actually knows why AI works
- Every AI Model Explained
- How does artificial intelligence learn? (Briana Brownell)
- The Mystery of 'Latent Space' in Machine Learning Explained!