If you've used ChatGPT, Claude, or Gemini, you've probably seen the word "token" thrown around without explanation. It shows up in pricing pages, error messages, and developer docs. Here is what tokens actually mean, in plain English.
A token is a small chunk of text that AI language models read and write. Think of it as a syllable or piece of a word, not a letter and not a full word. Common short words ("the", "and", "is") are usually 1 token. Longer words get split into 2-5 tokens.
Roughly: 1 token = 4 characters = 0.75 words in English. So 100 words is about 130 tokens. 1,000 tokens is about 750 words. A typical novel is about 100,000 tokens.
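These rules of thumb are easy to turn into a quick estimator. The sketch below is a heuristic only — exact counts depend on the model's tokenizer:

```python
# Rough token estimation for English text, using the rules of thumb above:
# 1 token ≈ 4 characters ≈ 0.75 words. These are heuristics, not exact counts;
# only the model's real tokenizer gives exact numbers.

def estimate_tokens_from_chars(text: str) -> int:
    """Estimate token count as characters / 4."""
    return round(len(text) / 4)

def estimate_tokens_from_words(text: str) -> int:
    """Estimate token count as words / 0.75."""
    return round(len(text.split()) / 0.75)

text = "The quick brown fox jumps over the lazy dog."
print(estimate_tokens_from_chars(text))  # 44 characters -> 11
print(estimate_tokens_from_words(text))  # 9 words -> 12
```

The two estimates rarely agree exactly; either is close enough for budgeting a prompt.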
Language models don't read text the way humans do. They process chunks of text mathematically — each chunk gets converted to a number, the number gets fed through a neural network, and the network produces a prediction for the next chunk.
The "chunks" are tokens. They're chosen during training to balance two things:
The result is a vocabulary of 50,000 to 200,000 tokens that covers common words as single units and breaks down rare words into smaller pieces.
See exactly how text gets tokenized: Open Token Counter →

| Word | Tokens | Notes |
|---|---|---|
| the | 1 | Common short word |
| hello | 1 | Common medium word |
| hamburger | 3 | ham + burg + er |
| unbelievable | 3 | un + believ + able |
| supercalifragilistic | 7 | Rare, broken into pieces |
| hippopotamus | 3 | hip + popot + amus |
| 😊 | 1 | Emoji = 1 token |
| 1234567890 | 4 | Numbers split into pieces |
Tokenization is not random. The pieces are learned from training data — common patterns get one token, uncommon patterns get split.
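A toy sketch of that idea: greedy longest-match against a tiny hand-picked vocabulary. Real tokenizers learn tens of thousands of pieces from training data (byte-pair encoding); this illustration only shows why common words stay whole while rare ones split:

```python
# Toy illustration of subword tokenization: greedy longest-match against a tiny
# hand-picked vocabulary. Real tokenizers learn ~50,000-200,000 pieces from
# data; this just shows why "hamburger" splits but "the" doesn't.

TOY_VOCAB = {"the", "and", "is", "ham", "burg", "er", "un", "believ", "able"}

def toy_tokenize(word: str) -> list[str]:
    """Split a word into the longest vocabulary pieces, left to right.
    Characters with no vocabulary match become single-character tokens."""
    tokens, i = [], 0
    while i < len(word):
        # Try the longest remaining substring first, then shorter ones.
        for j in range(len(word), i, -1):
            if word[i:j] in TOY_VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:  # no vocabulary entry matched: emit one character
            tokens.append(word[i])
            i += 1
    return tokens

print(toy_tokenize("the"))           # ['the'] -> 1 token
print(toy_tokenize("hamburger"))     # ['ham', 'burg', 'er'] -> 3 tokens
print(toy_tokenize("unbelievable"))  # ['un', 'believ', 'able'] -> 3 tokens
```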
Three reasons developers and AI users care about tokens:
1. Cost. AI APIs charge per token. Every prompt and every response costs money based on token count. Understanding tokens lets you predict and control your bill.
2. Context window limits. Each AI model has a maximum number of tokens it can process in one request. Send more than the limit and the request fails. GPT-4o limit: 128,000 tokens. Claude limit: 200,000 tokens. Gemini limit: 1-2 million tokens.
3. Speed. Models generate output one token at a time, and longer prompts take longer to process too — so more tokens means more waiting. A short prompt gets a response in seconds; a very long prompt and reply can take a minute or more.
Most APIs charge separately for input tokens and output tokens. Output is usually more expensive (3-5x) because generating text is computationally harder than reading it.
| Model | Input ($/M tokens) | Output ($/M tokens) |
|---|---|---|
| GPT-4o mini | $0.15 | $0.60 |
| GPT-4o | $2.50 | $10.00 |
| Claude Sonnet 4 | $3.00 | $15.00 |
| Claude Opus 4 | $15.00 | $75.00 |
| Gemini 2.5 Flash | $0.15 | $0.60 |
| Gemini 2.5 Pro | $1.25 | $10.00 |
"$/M tokens" means dollars per million tokens. So GPT-4o at $2.50/M input means a million input tokens cost $2.50. A typical chat message is 100-500 input tokens — fractions of a cent.
A typical ChatGPT-style chat message: roughly 800 input tokens (your message plus the system prompt and recent conversation history) and 250 output tokens (the reply).
Cost on different models:
| Model | Cost per message | Cost per 1,000 messages |
|---|---|---|
| GPT-4o mini | $0.000270 | $0.27 |
| GPT-4o | $0.00450 | $4.50 |
| Claude Sonnet 4 | $0.00615 | $6.15 |
| Gemini 2.5 Flash | $0.000270 | $0.27 |
| Claude Opus 4 | $0.03075 | $30.75 |
For personal use, AI is essentially free on the cheap models. For production at scale (millions of requests), the difference between cheap and premium models adds up to thousands of dollars per month.
Two ways: paste your text into an online token counter, or call the provider's official tokenizer library (such as OpenAI's tiktoken) from code.
For most uses, the online counter is fine. For exact billing or production code, use the official tokenizer.
A context window is the maximum tokens an AI model can process in one request. It includes:

1. The system prompt
2. The conversation history
3. Your current message
4. Any attached documents or tool output
5. The model's response
All of these share the same window. If GPT-4o has a 128K window, the total of all five must fit under 128K. Send more and you get an error.
This is why long conversations sometimes "forget" earlier messages — the conversation history grew too long to fit in the window, so older messages got dropped.
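A minimal sketch of that trimming behavior, with made-up token counts (a real client would count tokens with the model's tokenizer, and would usually pin the system prompt rather than trim it):

```python
# Sketch of why long chats "forget": before each request, drop the oldest
# messages until the whole conversation fits the context window. Token counts
# here are illustrative; a real client would use the model's tokenizer.

def trim_to_fit(messages: list[tuple[str, int]], budget: int) -> list[tuple[str, int]]:
    """Drop the oldest (message, token_count) pairs until the total <= budget."""
    trimmed = list(messages)
    while trimmed and sum(count for _, count in trimmed) > budget:
        trimmed.pop(0)  # the oldest message is forgotten first
    return trimmed

history = [("msg 1", 500), ("msg 2", 700), ("msg 3", 400), ("msg 4", 300)]
kept = trim_to_fit(history, budget=1000)
print(kept)  # [('msg 3', 400), ('msg 4', 300)] -- msg 1 and msg 2 dropped
```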
If you remember those three things — cost, context window limits, and speed — you can navigate AI pricing and limits without confusion.
See tokens in action. Paste any text and see exact counts.
Open Token Counter →