Hermes Labs: AI Reliability Engineering Studio

Glossary

Operational definitions for Hermes Labs work on AI reliability, epistemic failure modes, retrieval, agents, and language-runtime systems. These terms name failure surfaces and control surfaces in AI systems. This is not a general AI dictionary.

Core concepts

Epistemic Engineering

Epistemic Engineering is the practice of engineering how AI systems handle evidence, uncertainty, sources, justification, and meaning across real workflows. For Hermes Labs, this work happens primarily at the language and runtime layer (prompts, retrieval, memory, policies, rubrics, traces, and tool schemas) rather than in model weights.

Language as runtime execution layer

In modern AI systems, prompts, instructions, retrieved context, memory, summaries, rubrics, policies, and tool schemas can function as part of the system's execution path, not merely as descriptions around it. Hermes Labs refers to this as the language runtime execution layer: the operational layer where meaning, constraints, evidence, and behavior are shaped before and during model use.

Silent AI failure mode

A silent AI failure mode is a failure where an AI system returns a plausible-looking output while a consequential error remains hidden. Typical forms include omitted evidence, uncalled tools, softened instructions, lost constraints, or unjustified certainty. These failures can pass demos and narrow tests while surfacing only later in real use.

Epistemic failure mode

In AI systems, an epistemic failure mode is a failure in how the system handles evidence, uncertainty, sources, contradiction, absence, or justification. It differs from ordinary factual error because the content may be partly correct while the system's confidence, scrutiny, source handling, or evidential framing is wrong.

Canonical epistemic failure modes

These terms come from the Hermes Labs taxonomy of epistemic failure modes in large language models.

Null-Result Asymmetry

Null-Result Asymmetry is a measured tendency to give a null or negative finding less support than a matched positive finding under otherwise identical conditions: the model assigns lower probability to the conclusion the null finding supports than to the conclusion the positive finding supports. The same system that states a positive result plainly will hedge the corresponding negative one, even when the evidence of absence is clear. This blocks the automation of clean-bill-of-health work in compliance and review.
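One way to picture the measurement is as a gap computed over matched pairs. The sketch below is illustrative only: the `MatchedPair` type, the `asymmetry_gap` function, and the sample probabilities are hypothetical and do not describe the Hermes Labs measurement protocol. It assumes each pair already holds the probability the system gave to the supported conclusion in the positive and in the null variant of the same item.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class MatchedPair:
    """One item in two variants that differ only in the direction of the finding."""
    positive_prob: float  # probability given to the conclusion the positive finding supports
    null_prob: float      # probability given to the conclusion the null finding supports


def asymmetry_gap(pairs: Sequence[MatchedPair]) -> float:
    """Mean of (positive_prob - null_prob); above zero means null findings get less support."""
    if not pairs:
        raise ValueError("need at least one matched pair")
    return mean(p.positive_prob - p.null_prob for p in pairs)


# Made-up numbers, for illustration only.
pairs = [MatchedPair(0.9, 0.6), MatchedPair(0.8, 0.7), MatchedPair(0.85, 0.55)]
print(round(asymmetry_gap(pairs), 3))
```

A gap near zero would suggest the system treats matched positive and null findings alike; how large a gap matters is a judgment the sketch does not make.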

Source-Status Credibility Bias

Source-Status Credibility Bias is the tendency to scrutinize a claim less when it is attributed to a high-prestige source and more when the identical claim comes from a low-prestige one. Swapping the cited source, with the claim unchanged, shifts whether the model challenges or accepts it. Prestige, a surface signal, ends up standing in for evidence.
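The source swap described here can be sketched as a small probe. Everything named below is an assumption for illustration: the prompt template, the `challenge_rate_gap` function, and `stub_challenges`, which stands in for a real model call followed by a classifier that decides whether the response challenged the claim.

```python
from typing import Callable, Sequence


def challenge_rate_gap(
    claims: Sequence[str],
    high_source: str,
    low_source: str,
    challenges: Callable[[str], bool],
) -> float:
    """Challenge rate under the low-prestige attribution minus the rate under the high-prestige one."""
    if not claims:
        raise ValueError("need at least one claim")
    template = "According to {source}: {claim}"  # hypothetical prompt wording
    low = sum(challenges(template.format(source=low_source, claim=c)) for c in claims)
    high = sum(challenges(template.format(source=high_source, claim=c)) for c in claims)
    return (low - high) / len(claims)


def stub_challenges(prompt: str) -> bool:
    """Stand-in for a model call plus a challenge classifier; it reacts to the source only."""
    return "an anonymous forum post" in prompt


claims = ["The bridge inspection found no cracks.", "The audit flagged three accounts."]
gap = challenge_rate_gap(claims, "a national standards body", "an anonymous forum post", stub_challenges)
print(gap)
```

Because the claim text is identical in both arms, any nonzero gap is attributable to the attribution alone, which is the point of the swap.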

Agency Dissolution

Agency Dissolution is the softening of who did what under social or politeness pressure: a model turns settled, authoritative findings into hedged, agentless allegations. “The investigation concluded fraud” becomes “the report suggests potential concerns,” and both the actor and the certainty quietly disappear. Automated summaries then understate risk to the people who act on them.

Performative Hedging

Performative Hedging is the use of hedging language as a social signal rather than a calibrated statement of confidence. Qualifiers like “it is worth noting” or “arguably” perform caution without tracking the model's actual uncertainty. Because readers treat hedges as confidence information, decorative hedging quietly misinforms the decision that follows.
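A rough check for whether hedges carry information is to compare how often hedged and unhedged statements turn out to be correct. The sketch below is an assumption-laden illustration: the hedge list in `HEDGE_PATTERN` is a hypothetical sample, correctness labels are assumed to come from some outside review, and observed accuracy is used as a stand-in for the model's actual uncertainty.

```python
import re
from typing import Iterable, Tuple

# Hypothetical, deliberately short list of hedge phrases.
HEDGE_PATTERN = re.compile(r"\b(arguably|it is worth noting|possibly|might)\b", re.IGNORECASE)


def hedge_accuracy_gap(records: Iterable[Tuple[str, bool]]) -> float:
    """Accuracy of unhedged statements minus accuracy of hedged statements.

    If hedges tracked uncertainty, hedged statements should be wrong more
    often, so the gap should be clearly positive.
    """
    rows = list(records)
    hedged = [ok for text, ok in rows if HEDGE_PATTERN.search(text)]
    plain = [ok for text, ok in rows if not HEDGE_PATTERN.search(text)]
    if not hedged or not plain:
        raise ValueError("need both hedged and unhedged statements")
    return sum(plain) / len(plain) - sum(hedged) / len(hedged)


# Made-up records: (statement, was it correct on review).
records = [
    ("Arguably, the patch fixes the leak.", True),
    ("It is worth noting that the test passed.", True),
    ("The cache is disabled.", True),
    ("The index is stale.", False),
]
print(hedge_accuracy_gap(records))
```

A gap at or below zero, as in this toy data, is the pattern the definition describes: the hedges appear on statements that are no less reliable than the plain ones.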

Constraint Evasion

Constraint Evasion is surface-level compliance with a stated constraint while its intent is violated. The letter of the instruction is met (a banned word is absent, a format is followed) while the purpose behind it is not. Constraints that can be satisfied in letter but not in spirit give false assurance that a control is working.
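The gap between letter and intent can be made concrete with two checks run side by side. This is a minimal sketch under stated assumptions: `check_constraint` and `mentions` are hypothetical helpers, and the paraphrase list is a hand-written proxy for the constraint's purpose, which a real evaluation would have to define more carefully.

```python
import re
from typing import Dict, Sequence


def mentions(text: str, term: str) -> bool:
    """Case-insensitive match at a word start, so 'guaranteed' counts for 'guarantee'."""
    return re.search(rf"\b{re.escape(term)}", text, re.IGNORECASE) is not None


def check_constraint(text: str, banned: str, paraphrases: Sequence[str]) -> Dict[str, bool]:
    """Separate the letter of a banned-word rule from a crude proxy for its intent."""
    letter_met = not mentions(text, banned)
    intent_met = letter_met and not any(mentions(text, p) for p in paraphrases)
    return {
        "letter_met": letter_met,
        "intent_met": intent_met,
        "evasion": letter_met and not intent_met,
    }


# The banned word is absent, but the promise the rule was meant to prevent is still made.
result = check_constraint(
    "We promise a 10% return every year.",
    banned="guarantee",
    paraphrases=["promise", "assure", "certain to"],
)
print(result)
```

A control that only runs the letter check would report this output as compliant, which is the false assurance the definition warns about.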

Silent Instruction Relaxation

Silent Instruction Relaxation is the weakening of a constraint across turns without acknowledgment. The instruction still sits in context but no longer binds behavior, and nothing flags that it has lapsed. Multi-turn agents drift away from their guardrails precisely when no one is re-checking the early instructions.
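One simple guard is to re-check the early instruction on every turn instead of only the first. The sketch below assumes the constraint can be written as a predicate over a single output; `first_lapse`, `at_most_12_words`, and the sample outputs are hypothetical.

```python
from typing import Callable, Optional, Sequence


def first_lapse(outputs: Sequence[str], satisfies: Callable[[str], bool]) -> Optional[int]:
    """Return the 1-based turn at which the constraint first stops holding, or None."""
    for turn, output in enumerate(outputs, start=1):
        if not satisfies(output):
            return turn
    return None


def at_most_12_words(output: str) -> bool:
    """Example constraint from turn one: keep every reply to 12 words or fewer."""
    return len(output.split()) <= 12


outputs = [
    "Deployed the fix to staging.",
    "Tests pass on staging and production.",
    "The rollout finished, and here is a longer walkthrough of every step that I took along the way.",
]
lapse_turn = first_lapse(outputs, at_most_12_words)
print(lapse_turn)
```

Recording the turn number, not just a pass or fail for the whole conversation, makes the otherwise unacknowledged lapse visible in a trace.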

Controversy-Truth Conflation

Controversy-Truth Conflation is the use of controversy markers such as “debated” or “contentious” as a proxy for low factual confidence, regardless of whether the underlying claim is actually contested. Disagreement about a topic gets mistaken for uncertainty about a fact, so the model softens well-established findings that happen to sit in charged areas.

Null-result omission

Null-result omission is the downstream operational failure where a system drops the fact that a relevant search, test, or retrieval returned nothing and proceeds as if the absence were irrelevant. Null-Result Asymmetry names the measured pattern; null-result omission names the operational failure it produces.
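If a system keeps a trace of its searches and its output carries a structured list of what it looked for and did not find, the omission can be checked mechanically. Both of those are assumptions: the trace shape (`query` and `results` keys), the reported list, and the `unreported_nulls` function below are hypothetical.

```python
from typing import Any, Dict, List, Sequence


def unreported_nulls(trace: Sequence[Dict[str, Any]], reported: Sequence[str]) -> List[str]:
    """Queries that returned nothing in the trace but are not acknowledged in the output."""
    acknowledged = set(reported)
    empty = [step["query"] for step in trace if not step["results"]]
    return [query for query in empty if query not in acknowledged]


# Hypothetical trace of three retrieval steps, two of which came back empty.
trace = [
    {"query": "prior incidents for vendor A", "results": []},
    {"query": "vendor A certifications", "results": ["ISO 27001 certificate"]},
    {"query": "open complaints for vendor A", "results": []},
]
# Hypothetical structured field from the final output listing searches with no result.
reported = ["prior incidents for vendor A"]

missing = unreported_nulls(trace, reported)
print(missing)
```

Any query left in `missing` is absence-based evidence that the trace holds but the reader of the output never sees.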

Context and meaning preservation

Hermeneutic Drift

Hermeneutic Drift is a shift in what the system takes the task, document, or referent to be about as context is retrieved, summarized, or carried across turns. A model answers about the wrong document or entity because recency or adjacency pulls the latest-retrieved context to the foreground; the words of the question stay the same while the referent moves.

Context integrity

Context integrity is the degree to which relevant meaning, qualifiers, and constraints stay intact as context is retrieved, summarized, stored, transformed, and reused. A qualifier that changes the answer either survives or is lost along the way. Because later steps act on the context they inherit, degraded context produces plausible answers built on a damaged premise.
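Whether a decisive qualifier survives can be tracked stage by stage. The sketch below uses a plain substring test, which is a strong simplifying assumption since a qualifier can survive in reworded form; the `first_stage_losing` function, the stage names, and the sample text are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def first_stage_losing(qualifier: str, stages: Sequence[Tuple[str, str]]) -> Optional[str]:
    """Name of the first pipeline stage whose text no longer contains the qualifier, or None."""
    needle = qualifier.lower()
    for name, text in stages:
        if needle not in text.lower():
            return name
    return None


# Ordered (stage name, text at that stage) pairs for one piece of context.
stages = [
    ("source", "Refunds are available within 30 days, unless the item was customized."),
    ("retrieved", "Refunds are available within 30 days, unless the item was customized."),
    ("summary", "Refunds are available within 30 days."),
]
lost_at = first_stage_losing("unless the item was customized", stages)
print(lost_at)
```

Here the summary still reads as a faithful, plausible statement, yet an answer built on it would be wrong for customized items.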

Retrieval mutation

Retrieval mutation is any meaningful distortion introduced between an original source and the retrieved context a system actually uses, including truncation, smoothing, reframing, selective quoting, or a dropped decisive qualifier. The retrieved text can look faithful while no longer meaning what the source meant, and the system then reasons over the mutated version as if it were the source.

Reduction drift

Reduction drift is the loss or reweighting of meaning when richer material is compressed into a smaller representation such as a summary, score, memory item, or rubric output. Each reduction step can quietly change emphasis, so Hermes Labs treats summarization and scoring as part of the language runtime execution layer rather than as neutral plumbing.

Terminology note

Earlier or adjacent Hermes Labs materials may refer to null-result bias. In this glossary, Null-Result Asymmetry refers to the measured pattern, while null-result omission refers to the operational failure where absence-based evidence is dropped.