Glossary / Core concepts
Silent AI failure mode
A silent AI failure mode is a failure where an AI system returns a plausible-looking output while a consequential error remains hidden. Typical forms include omitted evidence, uncalled tools, softened instructions, lost constraints, or unjustified certainty. These failures can pass demos and narrow tests while surfacing only later in real use.
How it manifests · In practice
A system passes its eval suite, ships, and then omits a decisive caveat, skips a tool call, or drops a constraint in a way no log surfaces. The essay Why your AI lies when the data is right walks through one such pattern.
Evidence: Why your AI lies when the data is right
Related terms
- Epistemic failure mode · Core concepts
- Silent Instruction Relaxation · Canonical epistemic failure modes