AI Glossary
Attention Mechanism
How AI models focus on the most relevant context
Definition
The attention mechanism is a component of neural networks that lets a model weight different parts of the input differently when producing each output element. In practice, each position emits a query that is compared against the keys of all positions; the resulting similarity scores are normalized into attention weights, which are used to take a weighted average of the values. "Self-attention" applies this within a single sequence, relating every token to every other token. This is the key innovation in the Transformer architecture and is what allows LLMs to capture long-range dependencies in text.
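The definition above can be sketched in code. This is a minimal single-head scaled dot-product self-attention in NumPy; the function and variable names are illustrative, not from any particular library:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a sequence of token vectors.

    x           : (seq_len, d_model) token embeddings
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    """
    q = x @ w_q  # queries: what each token is looking for
    k = x @ w_k  # keys: what each token offers
    v = x @ w_v  # values: the content to be mixed
    d_k = q.shape[-1]
    # Every token scores every other token (query-key dot products),
    # scaled by sqrt(d_k) to keep the softmax well-behaved.
    scores = q @ k.T / np.sqrt(d_k)      # (seq_len, seq_len)
    weights = softmax(scores, axis=-1)   # each row sums to 1
    # Output for each token is an attention-weighted average of values.
    return weights @ v                   # (seq_len, d_k)

# Toy example: 4 tokens, model dim 8, head dim 4.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q = rng.normal(size=(8, 4))
w_k = rng.normal(size=(8, 4))
w_v = rng.normal(size=(8, 4))
out = self_attention(x, w_q, w_k, w_v)
```

Because the score matrix is (seq_len, seq_len), every token can draw directly on any other token, which is how long-range dependencies are captured in one step.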