What are non-stationary bandits?

A non-stationary bandit is like a toy that changes its behavior every now and then, so you never know quite what to expect next.

Imagine you're playing with a vending machine that gives out candy, but it doesn’t always give the same kind. You have to choose which button to press each time, hoping for your favorite treat. That's a bandit: a game where you make the same kind of choice again and again, trying to collect as much reward as you can.

When the Vending Machine Changes

Now imagine this vending machine sometimes changes what’s inside. Maybe one day a button gives chocolate, and the next day it gives gummy bears. This is a non-stationary bandit: the best choice isn’t always the same, so you have to keep trying the other buttons now and then and adjust your strategy when things change.
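For readers who like to see the idea in code, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two-button machine, the payout chances of 0.8 and 0.2 that swap halfway through, the function name play_vending_machine, and the "explore 10% of the time" rule (often called epsilon-greedy). It compares two ways of remembering how good each button is: a plain running average of every reward ever seen, and a constant step size that lets recent rewards count for more.

```python
import random


def play_vending_machine(step_size, n_rounds=2000, epsilon=0.1, seed=0):
    """Press buttons on a hypothetical two-button machine that changes halfway.

    step_size=None keeps a plain running average of each button's rewards.
    A float (for example 0.1) uses a constant step size instead, so recent
    rewards weigh more than old ones.
    Returns the fraction of second-half presses that chose the better button.
    """
    rng = random.Random(seed)
    estimates = [0.0, 0.0]  # current guess of how good each button is
    counts = [0, 0]         # how many times each button has been pressed
    half = n_rounds // 2
    good_presses = 0

    for t in range(n_rounds):
        # The machine changes: the payout chances swap at the halfway point.
        payout_chance = [0.8, 0.2] if t < half else [0.2, 0.8]

        # Mostly press the button that looks best, but sometimes try another.
        if rng.random() < epsilon:
            button = rng.randrange(2)
        else:
            button = 0 if estimates[0] >= estimates[1] else 1

        reward = 1.0 if rng.random() < payout_chance[button] else 0.0

        # Nudge the guess for the pressed button toward the reward just seen.
        counts[button] += 1
        alpha = step_size if step_size is not None else 1.0 / counts[button]
        estimates[button] += alpha * (reward - estimates[button])

        # After the change, button 1 is the better one.
        if t >= half and button == 1:
            good_presses += 1

    return good_presses / (n_rounds - half)


if __name__ == "__main__":
    print("running average:", play_vending_machine(step_size=None))
    print("constant step  :", play_vending_machine(step_size=0.1))
```

In this sketch the running average is slow to notice the change, because a thousand old rewards outweigh the new ones, while the constant step size forgets old rewards quickly and switches buttons soon after the machine changes. The exact numbers depend on the seed and the settings chosen here.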

Why It Matters

In real life, this idea helps people make better choices when things are constantly changing, like picking which app to use or what route to take to school. You’re learning and adapting as you go, just like playing with a tricky vending machine!

Examples

A child choosing between different candy jars, where the amount of candy in each jar changes every day.
A person choosing among restaurants for lunch, where the one that is best on weekdays may not be the best on weekends.
A student deciding which study method to use, knowing that the effectiveness of each method varies depending on the subject.

Categories: Psychology · bandit problems · machine learning · decision theory