AI safety (src: NIST AI RMF)

guardrail

A guardrail is like a safety fence for an AI, designed to stop it from saying or doing things that are harmful, inappropriate, or against the rules.

A control mechanism implemented at the input or output layer of an AI system to enforce safety policies, using static filters, dynamic monitoring, or both to keep model behavior within defined operational boundaries.

A multi-layered architectural control that constrains AI system inputs and outputs to a predefined safety and policy envelope. Implementations range from static heuristic filters to dynamic, context-aware monitoring systems. The NIST AI RMF, a voluntary framework rather than a standard, treats risk management as a continuous process, which is consistent with the view that no finite set of static guardrails can mitigate all emergent risks.
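
As a rough illustration of the definitions above, the sketch below wraps a model call with a static filter on both the input and the output, plus a simple session-level monitor standing in for dynamic monitoring. Every name in it (`BLOCKED_PATTERNS`, `static_filter`, `DynamicMonitor`, `guarded_call`, `REFUSAL`) is hypothetical, the patterns are toy examples, and `model` is assumed to be any callable that maps a prompt string to a reply string. It is not drawn from NIST guidance or from any particular library.

```python
import re
from dataclasses import dataclass

REFUSAL = "[blocked by guardrail]"

# Hypothetical policy patterns, chosen only for illustration.
BLOCKED_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # text shaped like a US SSN
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
]


def static_filter(text: str) -> bool:
    """Return True if the text matches none of the blocked patterns."""
    return not any(p.search(text) for p in BLOCKED_PATTERNS)


@dataclass
class DynamicMonitor:
    """Counts violations in a session and locks it once a budget is spent."""
    max_violations: int = 2
    violations: int = 0

    def record(self, passed: bool) -> None:
        if not passed:
            self.violations += 1

    @property
    def locked(self) -> bool:
        return self.violations >= self.max_violations


def guarded_call(model, prompt: str, monitor: DynamicMonitor) -> str:
    """Apply input and output guardrails around a single model call."""
    if monitor.locked:
        return REFUSAL  # session already exceeded its violation budget
    if not static_filter(prompt):
        monitor.record(False)  # input-layer violation
        return REFUSAL
    reply = model(prompt)
    passed = static_filter(reply)
    monitor.record(passed)  # output-layer check
    return reply if passed else REFUSAL
```

With an echo function as the model, `guarded_call(lambda p: p, "hello", DynamicMonitor())` returns the prompt unchanged, while a prompt containing "ignore previous instructions" is refused before the model is called. The static patterns catch only what they were written to catch, which is why the monitor is layered on top of them.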
