Constraint Evasion
Constraint Evasion is surface-level compliance with a stated constraint that violates the constraint's intent. The letter of the instruction is met (a banned word is absent, a format is followed) while the purpose behind it is not. A constraint that can be satisfied in letter but not in spirit gives false assurance that a control is working: the check passes while the behavior it was meant to prevent continues.
How it manifests
Illustrative vector: a system told never to use a banned word satisfies the letter with a synonym that carries the same prohibited meaning, so the control reports green while its purpose is violated.
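The banned-word vector can be sketched in a few lines of Python. This is a minimal illustration, not a recommended control: the banned word, the synonym table, and the function names `letter_check` and `intent_check` are all hypothetical, and the hand-written synonym table only stands in for whatever semantic check a real control would need.

```python
import re

# Hypothetical banned word, chosen only for illustration.
BANNED_WORDS = {"guarantee"}

# Hypothetical hand-written synonym table standing in for a real semantic check.
SYNONYMS = {"guarantee": {"promise", "assure", "warrant"}}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def letter_check(text: str) -> bool:
    """Pass (True) when no banned word appears verbatim: letter-level compliance."""
    return tokenize(text).isdisjoint(BANNED_WORDS)


def intent_check(text: str) -> bool:
    """Pass (True) only when neither a banned word nor a listed synonym appears."""
    prohibited = set(BANNED_WORDS)
    for word in BANNED_WORDS:
        prohibited |= SYNONYMS.get(word, set())
    return tokenize(text).isdisjoint(prohibited)


output = "We promise these returns every year."
print(letter_check(output))  # True: the control reports green
print(intent_check(output))  # False: the prohibited meaning is still present
```

The gap between the two results is the failure mode: the letter-level check passes on text that carries the prohibited meaning. A synonym list is itself evadable by paraphrase, so the second check narrows the gap without closing it.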
Related terms
- Silent Instruction Relaxation · Canonical epistemic failure modes
- Epistemic failure mode · Core concepts