AI Glossary
Token / Tokenization
The basic unit of text that AI models process
Definition
A token is the basic unit that language models process — roughly 3-4 characters or about 0.75 words in English. "Tokenization" is the process of splitting text into these chunks before feeding it to a model. Tokens determine pricing (most APIs charge per token), context window limits, and model speed. A 4,000-token context window fits roughly 3,000 words.
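To see how text maps to tokens in practice, here is a minimal sketch using OpenAI's tiktoken library (an assumed choice for illustration; other model families ship their own tokenizers, so exact counts vary):

```python
# pip install tiktoken
import tiktoken

# Load a tokenizer used by recent OpenAI models (assumed encoding for this example).
enc = tiktoken.get_encoding("cl100k_base")

text = "Tokenization splits text into chunks before the model sees it."
token_ids = enc.encode(text)                    # text -> list of integer token IDs
pieces = [enc.decode([t]) for t in token_ids]   # decode each ID back to its text chunk

print(f"{len(text)} characters -> {len(token_ids)} tokens")
print(pieces)  # e.g. ['Token', 'ization', ' splits', ' text', ...]
```

Counting tokens this way before sending a request is the usual way to estimate API cost and to check that a prompt fits inside a model's context window.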