
AI Glossary

Guardrails

Rules and filters that prevent AI systems from producing harmful outputs

Definition

Guardrails are safety mechanisms applied to AI systems to prevent them from producing harmful, inappropriate, or off-topic outputs. They can be implemented through fine-tuning (constitutional AI, RLHF), system prompts, or post-generation filtering. All major AI APIs have built-in guardrails, and enterprise deployments often add custom guardrails for their specific use cases.
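Of the mechanisms above, post-generation filtering is the simplest to illustrate. Below is a minimal sketch of a keyword-based output filter; the pattern list, refusal message, and function name are illustrative, not taken from any real API or deployment.

```python
import re

# Illustrative blocklist: patterns the guardrail refuses to return.
# Real deployments typically use classifiers or moderation APIs instead.
BLOCKED_PATTERNS = [
    re.compile(r"\b(?:credit card number|social security number)\b", re.IGNORECASE),
]

REFUSAL = "[Response withheld: output violated content policy.]"

def apply_guardrail(model_output: str) -> str:
    """Return the model output unchanged if it passes, else a refusal."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(model_output):
            return REFUSAL
    return model_output

print(apply_guardrail("The capital of France is Paris."))
print(apply_guardrail("Here is her Social Security Number: ..."))
```

In practice such filters are layered with model-level guardrails (fine-tuning, system prompts), since simple pattern matching is easy to bypass.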
