How Computers Store Text - ASCII, Unicode, UTF-8, UTF-16, and UTF-32

Computers can only store numbers, so they use special codes that give every letter and symbol its own number, like a secret language.

ASCII is like a small dictionary that has only 128 entries, just enough for basic English letters, digits, and common symbols. It’s great for simple messages but doesn’t have room for characters from other languages, like the Spanish ñ or Chinese characters.
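
To peek inside that small dictionary, here is a short Python sketch. The helper name ascii_codes is made up for this example. It looks up the ASCII number for each character and shows that a letter like ñ has no entry.

```python
def ascii_codes(text):
    """Return the ASCII number for each character in text."""
    # encode("ascii") raises UnicodeEncodeError when a character is not in ASCII
    return list(text.encode("ascii"))


print(ascii_codes("Hi!"))  # [72, 105, 33]

try:
    ascii_codes("ñ")
except UnicodeEncodeError:
    print("ñ is not in the ASCII dictionary")
```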

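When we needed more characters, Unicode came along as the big library that aims to hold every character people write with, well over 100,000 of them! Unicode gives each character its own number, called a code point. But Unicode alone doesn’t tell computers how to store those numbers as bytes in memory.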
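
To see the number Unicode gives a character (its code point), you can ask Python's built-in ord function. This is only a small sketch, and the helper name code_point is made up for this example.

```python
def code_point(ch):
    """Return the Unicode number of one character, written like U+0041."""
    # ord() gives the number; :04X prints it in hexadecimal with at least 4 digits
    return f"U+{ord(ch):04X}"


for ch in ["A", "ñ", "中", "😀"]:
    print(ch, code_point(ch))

# chr() goes the other way, from a number back to a character
print(chr(0x4E2D))  # 中
```

Notice that these are just numbers in the library. How many bytes each one takes up depends on the encoding you pick.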

That’s where UTF-8, UTF-16, and UTF-32 come in. They’re like different-sized backpacks for carrying those characters:

  • UTF-8 is a smart, space-saving backpack: it uses just 1 byte for simple letters (the same ones as in ASCII) but can stretch to 4 bytes for more complex characters.
  • UTF-16 is a bigger backpack that uses 2 bytes for most everyday characters and 4 bytes for rarer ones such as emoji, which suits many languages.
  • UTF-32 is the biggest backpack: it uses 4 bytes for every character, which is easy for computers to work with but takes up more space.
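
To compare the three backpacks, the sketch below counts how many bytes one character needs in each encoding. The helper name backpack_sizes is made up for this example, and the "-le" codec names are a choice made here so that Python does not add an extra marker (a byte order mark) to the count.

```python
def backpack_sizes(ch):
    """Return how many bytes ch needs in UTF-8, UTF-16, and UTF-32."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # the "-le" variants skip the byte order mark that plain
        # "utf-16" and "utf-32" would add at the front
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


for ch in ["A", "ñ", "中", "😀"]:
    print(ch, backpack_sizes(ch))
```

With these four characters, UTF-8 grows from 1 to 4 bytes, UTF-16 uses 2 bytes until the emoji needs 4, and UTF-32 always uses 4.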

So now computers can read and write in any language, just like you can write your name in any color crayon!

Examples

  1. A simple message like 'Hello' is stored inside a computer as five numbers, one per letter, using ASCII.
  2. Unicode lets computers understand more languages, like Spanish or Chinese.
  3. UTF-8 is the most common encoding on the internet, because it keeps plain English text small while still handling every other language.
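
The examples above can be tried out in Python. This is only a sketch: the helper name send_and_receive is made up, and it just pretends to send a message by turning it into UTF-8 bytes and reading it back.

```python
message = "Hello"
as_ascii = message.encode("ascii")
as_utf8 = message.encode("utf-8")

print(list(as_utf8))        # [72, 101, 108, 108, 111]
print(as_ascii == as_utf8)  # True: for plain letters, UTF-8 matches ASCII


def send_and_receive(text):
    """Pretend to send text: pack it into UTF-8 bytes, then unpack it."""
    data = text.encode("utf-8")   # what would travel over the network
    return data.decode("utf-8")   # what the other computer reads back


print(send_and_receive("你好"))
```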
