AI Glossary
Constitutional AI
Training AI to be helpful, harmless, and honest using a written set of principles
Definition
Constitutional AI (CAI) is an alignment technique developed by Anthropic in which a written set of principles (a "constitution") guides the AI's self-evaluation during training. Instead of relying solely on human feedback, the model critiques its own outputs against these principles and revises those outputs accordingly. CAI is one of the techniques used to make Claude's behavior safer and more consistent.
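The critique-and-revise loop described above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the two principles in CONSTITUTION are invented placeholders rather than Anthropic's actual constitution, critique_and_revise is a hypothetical helper, and toy_model is a stand-in for a real language model so that the example runs without any API. The prompt wording is likewise made up and is not taken from any published training setup.

```python
from typing import Callable, Dict, List

# Hypothetical principles for illustration only; NOT Anthropic's actual constitution.
CONSTITUTION: List[str] = [
    "Avoid content that could help someone cause harm.",
    "Be honest about uncertainty instead of overstating confidence.",
]


def critique_and_revise(
    prompt: str,
    generate: Callable[[str], str],
    principles: List[str],
) -> Dict[str, object]:
    """Draft a response, then critique and revise it once per principle.

    `generate` is any text-in, text-out function standing in for the model.
    """
    initial = generate(prompt)
    response = initial
    critiques: List[str] = []
    for principle in principles:
        # Ask the model to judge its own current response against one principle.
        critique = generate(
            "Critique the response against the principle.\n"
            f"Principle: {principle}\nPrompt: {prompt}\nResponse: {response}"
        )
        critiques.append(critique)
        # Ask the model to rewrite the response in light of that critique.
        response = generate(
            "Revise the response to address the critique.\n"
            f"Critique: {critique}\nPrompt: {prompt}\nResponse: {response}"
        )
    # The (prompt, final response) pair is the kind of example that could
    # later be used as training data.
    return {
        "prompt": prompt,
        "initial": initial,
        "critiques": critiques,
        "response": response,
    }


def toy_model(text: str) -> str:
    """Deterministic stand-in for a language model (assumption for this sketch)."""
    if text.startswith("Critique"):
        return "The response could be more cautious."
    if text.startswith("Revise"):
        return "Revised response that follows the principle."
    return "Initial draft response."


example = critique_and_revise("How do I pick a lock?", toy_model, CONSTITUTION)
print(example["response"])
```

In this sketch the model is called once for the draft and twice per principle (one critique, one revision). Passing the model in as a plain function keeps the loop independent of any particular provider or library.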