
Token

The basic unit of text that AI models actually process - roughly a word or part of a word - and the unit in which AI costs and limits are measured.

When you type a message to an AI, it does not read your text the way you wrote it. Before the model sees anything, your text is broken up into small pieces called tokens. Common short words - "the," "is," "and" - are usually one token each. Longer or less common words get split into parts. The word "unbelievable" might become three tokens: "un," "believ," "able." Numbers, spaces, and punctuation are often tokens too.
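The splitting behaviour described above can be sketched with a toy greedy subword tokeniser. The vocabulary here is a tiny made-up one chosen to reproduce the "unbelievable" example; real tokenisers learn vocabularies of tens of thousands of pieces from large corpora, so this is an illustration of the idea, not how any production tokeniser actually works.

```python
# Toy greedy subword tokeniser with a tiny, hypothetical vocabulary.
# Real tokenisers (e.g. BPE-based ones) learn their vocabularies from
# huge corpora; this only illustrates the splitting behaviour.
VOCAB = {"the", "is", "and", "un", "believ", "able"}

def tokenise(word):
    """Split a word greedily into the longest matching vocabulary pieces."""
    tokens, i = [], 0
    while i < len(word):
        # Try the longest remaining slice first, shrinking until a match.
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary piece matches: emit the character on its own.
            tokens.append(word[i])
            i += 1
    return tokens

print(tokenise("the"))           # a common short word stays one token
print(tokenise("unbelievable"))  # a longer word splits into subword pieces
```

Note that once "unbelievable" becomes the pieces "un", "believ", "able", the individual letters are no longer directly visible to anything downstream - a point that becomes relevant below.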

The tokenisation is done by a tokeniser - a piece of software that has learned the most efficient way to split text, based on how frequently different character combinations appear. The goal is to represent language as concisely as possible without losing meaning. In practice, one token works out to roughly three quarters of a word in English, so a thousand-word document is approximately 1,300 tokens.
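That rule of thumb is simple enough to turn into a back-of-the-envelope estimator. The 0.75 ratio is the approximation from above; it varies with language and content type, so treat the result as a rough guide only.

```python
# Rough token estimate using the ~0.75 words-per-token rule of thumb.
# The ratio varies by language and content type; this is an approximation.
WORDS_PER_TOKEN = 0.75

def estimate_tokens(word_count):
    """Estimate the token count of an English text from its word count."""
    return round(word_count / WORDS_PER_TOKEN)

print(estimate_tokens(1000))  # a thousand-word document: roughly 1,300 tokens
```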

This matters practically in two ways. First, the cost of using AI is almost always calculated in tokens - you pay for the number of tokens you send in (your question plus any documents) and the number of tokens the model sends back (its response). Understanding token counts helps you understand and control AI costs. Second, the context window - the limit of how much the AI can hold in view at once - is also measured in tokens.
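The cost arithmetic works as a simple sum over both directions of traffic. The per-token prices below are hypothetical placeholders, not any provider's actual rates (output tokens are often priced higher than input tokens, which the sketch reflects).

```python
# Sketch of per-token pricing. These prices are hypothetical placeholders,
# not any real provider's rates; providers typically quote per million tokens.
PRICE_PER_INPUT_TOKEN = 3.00 / 1_000_000    # e.g. $3 per million input tokens
PRICE_PER_OUTPUT_TOKEN = 15.00 / 1_000_000  # output tokens often cost more

def request_cost(input_tokens, output_tokens):
    """Cost of one request: tokens sent in plus tokens sent back."""
    return (input_tokens * PRICE_PER_INPUT_TOKEN
            + output_tokens * PRICE_PER_OUTPUT_TOKEN)

# A 2,000-token prompt (question plus documents) with a 500-token reply:
print(f"${request_cost(2000, 500):.4f}")  # $0.0135
```

A single request costs fractions of a cent, which is why token economics only start to bite at scale - the point the customer-email example below makes.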

Tokens also explain some counterintuitive behaviours. If you ask an AI to count how many times the letter "e" appears in a word, it sometimes gets this wrong. That is because the model never sees individual letters - it sees tokens. It does not work at the level of characters, which makes character-level tasks surprisingly difficult even for a model that writes fluent prose.

For most users, tokens are something you notice mainly through pricing and limits rather than something you need to think about directly. But if you are building something on top of an AI - a product, an automated workflow, an assistant - understanding token economics becomes important for making things work at any meaningful scale.

Analogy

Musical notation. A piece of music exists in your head as a continuous experience, but to write it down and perform it, it gets broken into discrete notes - specific units that musicians can read and play. Tokens do the same thing for language: they break continuous text into discrete units that the model can process mathematically.

Real-world example

When companies use AI to process large volumes of customer emails, they often optimise carefully to reduce token usage. Shorter, cleaner prompts - trimming unnecessary words, reusing common instructions efficiently - can reduce costs by 30 to 40 percent with no loss in quality. At scale, that difference is significant.

Why it matters

Token pricing and context limits are the practical economics that shape what AI applications are possible to build. Anyone building seriously on top of AI needs to understand them - not in a technical sense, but in the same way a business needs to understand its unit costs.
