How do large language models learn to 'reason'?

Large language models learn to reason by practicing with lots and lots of examples, just like a kid learns to count by counting blocks over and over.

Imagine you have a big box full of puzzles, and every time you solve one, you get a little smarter about how puzzles work. That's kind of what happens inside a language model. It looks at many sentences, stories, and even math problems, which show all the different ways people use words to think and explain things, and it tries to figure out the patterns.

How they practice

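Each time the model sees a sentence, like "If I have 3 apples and eat 1, how many are left?", it tries to guess what comes next. The model doesn't have feelings, but its guess gets checked against the real answer. The closer the guess is, the less the model needs to change. When the guess is wrong, the model's settings get nudged a tiny bit so the next guess is better. It's like playing a game of "What comes next?": the more you play, the better you get at guessing.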
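For grown-ups who want to peek under the hood, here is a tiny sketch of the "What comes next?" game in Python. It is only a toy, and it makes a big simplifying assumption: instead of nudging millions of settings the way a real language model does, it just counts which word tends to follow which. The names `train`, `guess_next`, and the practice sentences are made up for this illustration.

```python
from collections import Counter, defaultdict


def train(sentences):
    """Count which word follows each word in the practice sentences."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair every word with the word that comes right after it.
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def guess_next(follows, word):
    """Guess the word seen most often after `word`, or None if never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Made-up practice sentences (an assumption for this sketch).
practice = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the cat sat on the sofa",
]

model = train(practice)
print(guess_next(model, "cat"))  # "sat" followed "cat" twice, "ate" only once
```

With more practice sentences, the counts get richer and the guesses get better, which is the same idea as the game above, just much, much smaller.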

How they use what they learn

Once the model has practiced enough, it can take new problems and work through them step by step, just like you would when solving a puzzle or figuring out how many toys fit in a box. It uses all the patterns it has learned to think. It's not magic, just clever practice!

Examples

  1. A child learns to solve math problems by practicing many different examples.
  2. A dog learns to sit by being rewarded each time it does so correctly.
  3. A language model learns to reason by seeing many example questions and answers.