AI Glossary

Transformer Architecture

The breakthrough design that powers modern AI

Definition

The Transformer is a neural network architecture introduced in the 2017 paper "Attention Is All You Need." Unlike earlier recurrent models, which process tokens one at a time, it uses self-attention to relate all positions in a sequence in parallel, which made large-scale language model training practical. Nearly all modern LLMs — GPT, Claude, Gemini, Llama — are Transformer-based. The architecture has also been adapted for images, audio, and multimodal tasks.
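The self-attention mechanism at the core of the Transformer can be sketched in a few lines of NumPy. This is a toy single-head version with made-up sizes, not any particular model's implementation:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over one sequence.

    x:             (seq_len, d_model) input embeddings
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    """
    q = x @ w_q  # queries
    k = x @ w_k  # keys
    v = x @ w_v  # values
    # Similarity of every position with every other position, scaled
    scores = q @ k.T / np.sqrt(k.shape[-1])  # (seq_len, seq_len)
    # Row-wise softmax: each position attends to all positions at once
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v  # (seq_len, d_k)

# Toy example: 4-token sequence, model width 8, head width 4
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
w_q, w_k, w_v = (rng.standard_normal((8, 4)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 4): one output vector per input position
```

The key property is that the score matrix compares every position against every other in a single matrix multiplication, so the whole sequence is processed at once rather than step by step.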
