Q-learning is a way for a computer program to learn how to make the best choices by trying different things and seeing what works.
Imagine you're playing a game where you have to choose between two paths in a forest: one might lead to candy, the other to a silly monster. You don't know which one is better at first, so you try both. If you pick the path with candy, you feel happy; if you pick the one with the silly monster, you laugh and maybe get a little scared. Over time, you start remembering which choices gave you the best results.
Q-learning works the same way, but instead of you choosing paths in a forest, the learner is something like a robot or a video game character figuring out how to win by trying different actions and seeing what happens next.
How Q-learning learns
Every time the learner (like our robot) makes a choice, it keeps track of how good that choice was. It’s like having a little notebook (often called a Q-table) where it writes down: “If I picked action X in situation Y, I got Z points.” Each time it tries the same choice again, it nudges that score a little toward what it just saw. The more it plays or tries things out, the better it gets at picking actions that lead to the best results, just like you learn which path to pick when playing the forest game again and again.
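To make the notebook idea concrete, here is a minimal Python sketch of the forest game. It is only an illustration under some assumptions that are not in the text above: candy is worth +1 point and the monster -1, the game ends after one choice, and the names `REWARDS` and `train` and the settings `alpha` (how big each nudge is), `gamma` (how much future points count), and `epsilon` (how often the learner tries a random path) are all made up for this example.

```python
import random

# Hypothetical forest game: one situation (the fork) and two actions.
# Assumed points: candy on the left = +1, silly monster on the right = -1.
REWARDS = {"left": 1.0, "right": -1.0}

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # The "notebook": one score per (situation, action) pair, all starting at 0.
    q = {("fork", action): 0.0 for action in REWARDS}
    for _ in range(episodes):
        state = "fork"
        # Mostly pick the best-known path, but sometimes explore at random.
        if rng.random() < epsilon:
            action = rng.choice(list(REWARDS))
        else:
            action = max(REWARDS, key=lambda a: q[(state, a)])
        reward = REWARDS[action]
        # The game ends after one choice, so there are no future points to add.
        best_next = 0.0
        # Q-learning update: nudge the old score toward
        # (points just received + discounted best future score).
        target = reward + gamma * best_next
        q[(state, action)] += alpha * (target - q[(state, action)])
    return q

q_table = train()
best_action = max(REWARDS, key=lambda a: q_table[("fork", a)])
print(q_table)
print("Best path:", best_action)
```

After enough tries, the score written down for the candy path ends up higher than the score for the monster path, so the learner prefers the candy path. In a bigger game with many situations, `best_next` would be the best score in the notebook for the situation the learner lands in next, rather than 0.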