
AI Glossary

Token / Tokenization

The basic unit of text that AI models process

Definition

A token is the basic unit of text that a language model processes — often a whole word, a subword fragment, or a punctuation mark. In English, one token averages roughly 4 characters, or about 0.75 words. Tokenization is the process of splitting text into these chunks before feeding it to a model. Token counts determine pricing (most APIs charge per token), context window limits, and processing speed. A 4,000-token context window holds roughly 3,000 words.
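The rules of thumb above can be sketched as a quick estimator. This is a minimal illustration, not a real tokenizer (production models use learned subword schemes like BPE); the price constant is hypothetical, chosen only to show how per-token billing works.

```python
# Hypothetical price for illustration only: $0.50 per million tokens.
PRICE_PER_MILLION_TOKENS = 0.50

def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4-characters-per-token rule of thumb."""
    return max(1, round(len(text) / 4))

def estimate_cost(text: str) -> float:
    """Approximate API cost of sending `text`, under the assumed price above."""
    return estimate_tokens(text) * PRICE_PER_MILLION_TOKENS / 1_000_000

prompt = "Tokenization splits text into chunks before the model sees it."
print(estimate_tokens(prompt))           # rough token count
print(f"{estimate_cost(prompt):.8f}")    # rough cost in dollars
```

A 3,000-word document (~16,000 characters) estimates to about 4,000 tokens, matching the context-window figure in the definition; real counts vary by tokenizer and language.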

