How are large language models like ChatGPT actually trained?

Large language models like ChatGPT are trained by learning from lots of examples, just like a kid learns to read by looking at many books.

Imagine you're teaching your little brother how to write sentences. You show him one sentence after another: "The cat sat on the mat," "The dog ran in the park." At first, he might not know what's going on, but as he sees more and more examples, he starts to notice patterns, such as which words often go together and how sentences begin and end.

That's roughly what happens with large language models. They're shown billions of sentences from books and websites, like all the books in a huge library. The model looks at the start of a sentence and tries to guess the word that comes next. It keeps doing this over and over, learning from its mistakes, just like your brother gets better at writing as he practices.
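Here is a tiny sketch of the "guess what comes next" idea, using the two example sentences from above. It is only a stand-in: real language models use neural networks rather than a table of counts, and the names `next_word_counts` and `guess_next` are made up for this illustration.

```python
from collections import Counter, defaultdict

# The two example sentences, lowercased and without punctuation.
sentences = [
    "the cat sat on the mat",
    "the dog ran in the park",
]

# For every word, count which words were seen right after it.
next_word_counts = defaultdict(Counter)
for sentence in sentences:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1


def guess_next(word):
    """Return the word seen most often after `word`, or None if it was never seen."""
    counts = next_word_counts.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


print(guess_next("sat"))  # on
print(guess_next("ran"))  # in
```

With only two sentences the guesses are very limited. The more sentences the table has seen, the better its guesses become, which is the same reason real models are shown so much text.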

How it learns

Think of the model as a super smart student who's always trying to figure out the best answer. Every time it makes a mistake, like guessing "The dog ran in the tree" when the sentence really ended with "park", it slightly adjusts the numbers inside it so that "park" becomes a more likely guess next time. After many small corrections like this, it gets closer and closer to capturing how language works.
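The "learn from the mistake" step can be sketched with just two candidate words. This is a simplified illustration, not how ChatGPT is actually implemented: the starting scores, the `learning_rate` value, and the function name `probabilities` are all assumptions chosen for the example. The model starts out preferring "tree", and each round nudges the scores toward the correct word "park".

```python
import math

# Assumed starting point: the model wrongly prefers "tree" over "park".
scores = {"park": 0.0, "tree": 1.0}
correct = "park"
learning_rate = 0.5  # how big each correction is (an assumed value)


def probabilities(scores):
    """Turn raw scores into probabilities that add up to 1 (a softmax)."""
    exps = {word: math.exp(score) for word, score in scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


history = []  # probability given to the correct word before each correction
for step in range(20):
    probs = probabilities(scores)
    history.append(probs[correct])
    for word in scores:
        target = 1.0 if word == correct else 0.0
        # Gradient of the cross-entropy loss with respect to each score
        # is (probability - target), so we step in the opposite direction.
        scores[word] -= learning_rate * (probs[word] - target)

final = probabilities(scores)
print(f"before training: {history[0]:.2f}")
print(f"after training:  {final[correct]:.2f}")
```

Each pass makes only a small change, but the probability of "park" rises every time. Real models make the same kind of small correction to a very large number of internal values at once.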

After learning from all those examples, the model can write new sentences on its own, just like your brother can now make up his own stories!
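Writing a new sentence is the guessing step repeated: pick a next word, add it, and guess again. The sketch below does this with a simple table built from the two example sentences. It is a toy under stated assumptions: `followers` and `generate` are hypothetical names, and a real model picks words using a neural network instead of a lookup table.

```python
import random
from collections import defaultdict

sentences = [
    "the cat sat on the mat",
    "the dog ran in the park",
]

# For every word, remember each word that was seen right after it.
followers = defaultdict(list)
for sentence in sentences:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        followers[current].append(following)


def generate(start, max_words=8, seed=0):
    """Build a sentence by repeatedly picking a word that followed the last one."""
    rng = random.Random(seed)  # fixed seed so the same call gives the same result
    words = [start]
    # Stop at the length limit, or when the last word was never followed by anything.
    while len(words) < max_words and words[-1] in followers:
        words.append(rng.choice(followers[words[-1]]))
    return " ".join(words)


print(generate("the"))
```

Because the choice is random, the result can mix the two sentences, for example starting with the cat and ending in the park. That mixing is a small version of how a model produces sentences it was never shown.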

Examples

A child learns to speak by hearing many sentences and repeating them.
A teacher shows a student thousands of math problems until they understand the patterns.
A dog learns tricks through repetition and rewards.

Categories: Technology · large language models · ChatGPT · machine learning