What is UCB (Upper Confidence Bound)?

UCB, or Upper Confidence Bound, is a rule for picking among several options when you don't yet know which one is best.

Imagine you’re at a candy store with 5 different machines. Each one gives out a random number of candies, but you don’t know how many each one gives on average. You want to get the most candy possible, so you try them all a little bit and keep track of what you get from each machine. That’s like exploring.

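But here’s the trick: UCB helps you decide which machines are probably the best, based not just on what they gave you before, but also on how sure you are about them. It gives each machine a score: the average candy you got from it so far, plus a bonus that is bigger the fewer times you’ve tried it. The bonus is like a hint saying “I think this one might give even more candy than it has so far!” Each turn, you pick the machine with the highest score.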

So, if a machine gave you 10 candies and you’ve only tried it once, UCB gives it a big bonus: one try is not much evidence, so it might be really good, and you should try it again. But if another machine gave you about 9 candies each time over 10 tries, UCB gives it only a small bonus, because you already have a good idea of what it gives.
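
To put numbers on the two machines above, here is a small sketch using the common UCB1 formula (average plus the square root of 2 ln(total tries) / tries on that machine). The function name `ucb1_score` is made up for this example, and the constant 2 in the bonus is normally tuned for rewards between 0 and 1, so with raw candy counts the exact values are only illustrative.

```python
import math

def ucb1_score(average_reward, times_tried, total_tries):
    """Average reward so far plus an uncertainty bonus (UCB1 form).

    The bonus shrinks as times_tried grows, so well-tested machines
    get less benefit of the doubt.
    """
    bonus = math.sqrt(2 * math.log(total_tries) / times_tried)
    return average_reward + bonus

# Assumed numbers from the example: one machine tried once (10 candies),
# another tried 10 times (about 9 candies on average).
total_tries = 1 + 10
score_a = ucb1_score(10, 1, total_tries)   # tried once: large bonus
score_b = ucb1_score(9, 10, total_tries)   # tried 10 times: small bonus

print(round(score_a, 2), round(score_b, 2))  # prints 12.19 9.69
```

The machine tried only once gets a bonus of about 2.19, while the machine tried 10 times gets about 0.69, so UCB would pick the first machine next.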

This way, UCB balances exploration (trying new things) and exploitation (using what you already know works), so you end up with more candy, or in real life, better decisions!
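
The whole candy-store game can be sketched as a short simulation. This is a minimal version under stated assumptions, not a definitive implementation: each pull gives either 1 candy or none (a simplification of "a random number of candies"), the hidden chances in `win_chances` are invented for the demo, and `run_ucb1` is a hypothetical helper name.

```python
import math
import random

def run_ucb1(win_chances, rounds, seed=0):
    """Play `rounds` pulls with the UCB1 rule; return pulls per machine.

    win_chances are the hidden probabilities of getting a candy.
    The player never reads them directly; they only drive the rewards.
    """
    rng = random.Random(seed)
    n = len(win_chances)
    counts = [0] * n      # how many times each machine was tried
    totals = [0.0] * n    # total candy collected from each machine

    for t in range(1, rounds + 1):
        if t <= n:
            # Try every machine once so each has an average.
            arm = t - 1
        else:
            # Pick the machine with the highest average + bonus.
            arm = max(
                range(n),
                key=lambda i: totals[i] / counts[i]
                + math.sqrt(2 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < win_chances[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts

counts = run_ucb1([0.1, 0.3, 0.5, 0.7, 0.9], rounds=5000)
print(counts)
```

In a run like this, most pulls should end up on the last machine (the one with the highest chance), while the weaker machines are still tried occasionally. That occasional retrying is the exploration half of the balance.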
