How do large language models predict the next word?

Large language models guess what comes next by using patterns they’ve learned from reading enormous amounts of text.

Imagine you're telling a story to your friend. You say, "Once upon a time, there was a **cat**." Your friend might guess that the next word is "**mouse**," because that's a common pairing in stories. That's how language models work: they look at what came before and choose what fits best.

Like a super smart detective

A large language model acts like a super smart detective who has read millions of books. When it sees "Once upon a time, there was a," it checks its memory to see which words usually come after that phrase. It doesn't just pick any word; it picks the most likely one, based on how often it saw that word in similar situations.
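Real models use neural networks and huge vocabularies, but the detective's word-counting idea can be sketched with a toy model that just counts which word follows which. Everything here (the tiny story list, the `predict_next` helper) is made up purely for illustration:

```python
from collections import Counter, defaultdict

# A tiny "library" the detective has read (a stand-in for millions of books).
stories = [
    "once upon a time there was a cat",
    "once upon a time there was a dog",
    "once upon a time there was a cat",
]

# For each word, count which words were seen right after it.
next_word_counts = defaultdict(Counter)
for story in stories:
    words = story.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1

def predict_next(word):
    """Return the word seen most often after `word` -- the detective's best guess."""
    return next_word_counts[word].most_common(1)[0][0]

print(predict_next("was"))  # "a" -- every story had "a" right after "was"
print(predict_next("a"))    # "time" (seen 3 times) beats "cat" (2) and "dog" (1)
```

A real large language model does the same kind of pattern-matching, but it weighs the whole preceding text, not just the last word, and it learns the patterns in a neural network rather than a simple counting table.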

Like playing with building blocks

Think of each sentence as a row of building blocks. The model looks at the blocks already placed and chooses the best next block to add. It can even keep going, adding block after block, like a guessing game where each new guess builds on the last one.
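Adding several blocks in a row is just the one-word guess repeated, with each new word fed back in. A self-contained toy sketch (the story list and `continue_story` helper are invented for illustration):

```python
from collections import Counter, defaultdict

# Toy training text: the rows of blocks the model has already seen.
stories = [
    "the cat chased the mouse",
    "the cat chased the ball",
    "the cat chased the mouse",
]

# Count which block (word) followed each block.
next_word_counts = defaultdict(Counter)
for story in stories:
    words = story.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1

def continue_story(start, n_words):
    """Append the best-fitting next block, n_words times in a row."""
    words = start.split()
    for _ in range(n_words):
        best_guess = next_word_counts[words[-1]].most_common(1)[0][0]
        words.append(best_guess)
    return " ".join(words)

print(continue_story("the", 4))  # "the cat chased the cat"
```

Notice the toy model starts repeating itself, because it only remembers the single previous block. Real language models avoid this by looking at the whole story so far when choosing each next block.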

That’s how large language models predict the next word: by using patterns they’ve learned from lots of reading!


Examples

  1. A child learns to speak by noticing patterns in the words they hear, like how 'the' often comes before a noun.
  2. Imagine guessing the next word in a story based on what you've read so far; that's similar to what language models do.
  3. When typing an email, your phone might suggest the next word you want to write because it knows common phrases.
