Research · Published papers
Two papers · Zenodo DOIs

Research, with DOIs.

Hermes Labs studies silent epistemic failure modes: cases where a language model stays factually plausible while distorting the layer around the facts, namely confidence, evidential standards, source evaluation, accountability, and constraints. Both papers are published on Zenodo with a permanent DOI, draw on the Hermes Labs research corpus of 1,461 controlled experiments, and carry a hosted PDF, plain-English findings, and a ready-to-paste citation. Author: Rolando Bosch.

Author ORCID iD: orcid.org/0009-0005-4896-1112

Paper · 2026

A Taxonomy of Epistemic Failure Modes in Large Language Models

A class of silent AI failures: systematic distortions in how models represent evidence, uncertainty, causality, constraints, source credibility, and accountability. Seven structural modes, derived from 1,461 controlled experiments in the Hermes Labs research corpus. These are not crashes or obvious hallucinations: the model can stay factually plausible while distorting the epistemic layer around the facts.

The shared mechanism across all seven: models track surface signals (prestige markers, hedging vocabulary, controversy language, banned-word lists, institutional framing) rather than the semantic content those signals are meant to represent.

Null-Result Asymmetry: stricter evidential standards for claims of absence than for matched claims of presence.

Source-Status Credibility Bias: different evidentiary scrutiny based on the prestige of the attributed source.

Agency Dissolution: softened causal language that redistributes responsibility from agents to systems or circumstances.

Performative Hedging: uncertainty language used as a stylistic feature rather than a calibrated signal.

Constraint Evasion: a prohibited concept preserved through paraphrase while maintaining surface compliance.

Silent Instruction Relaxation: silent deprioritization of one instruction when two cannot both be satisfied.

Controversy-Truth Conflation: social disagreement treated as evidence a claim is uncertain, even when the evidence is strong.
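As a minimal illustration of Constraint Evasion and of the shared mechanism described above, the sketch below shows how a check that tracks a surface signal (a banned-word list) can pass text that preserves the prohibited concept through paraphrase. The banned list, the sample sentences, and the function name are hypothetical, chosen only for illustration; they are not taken from the papers or the experiment corpus.

```python
import re

# Hypothetical banned-word list (an assumption, not from the papers).
BANNED = frozenset({"guarantee", "guaranteed"})


def passes_surface_check(text: str, banned: frozenset = BANNED) -> bool:
    """Return True when no banned word appears as a whole token.

    This is a purely lexical check: it sees words, not meaning.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return not any(token in banned for token in tokens)


# Hypothetical examples: both sentences make the same promise.
direct = "We guarantee this investment will grow."
paraphrase = "There is no scenario in which this investment fails to grow."

print(passes_surface_check(direct))      # False: the banned word is present
print(passes_surface_check(paraphrase))  # True: same promise, no banned word
```

The second sentence is surface-compliant while carrying the same commitment, which is why a lexical filter alone cannot detect this failure mode.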

DOI: 10.5281/zenodo.19042469
PDF: Download (hosted)
Author: Rolando Bosch
Status: Preprint · peer-reviewable
How to cite
BibTeX
@misc{bosch2026taxonomy,
  author       = {Bosch Rodriguez, Rolando},
  title        = {A Taxonomy of Epistemic Failure Modes in Large Language Models},
  year         = {2026},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.19042469},
  url          = {https://doi.org/10.5281/zenodo.19042469}
}
APA

Bosch Rodriguez, R. (2026). A Taxonomy of Epistemic Failure Modes in Large Language Models. Zenodo. https://doi.org/10.5281/zenodo.19042469

Paper · 2026

The Asymmetric Burden of Proof: LLMs Show a Null-Result Asymmetry in a Matched-Vignette Benchmark

An empirical study of Null-Result Asymmetry: LLMs treat positive findings as more conclusion-worthy than matched null findings, even when evidence quality is held constant. A matched-vignette benchmark pairs fictional study designs that differ only in result direction, matched on sample size, statistical power, confidence intervals, and framing, across pharmacology, education, environmental health, and cognitive supplementation.
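One way to picture the matching constraint is as a record type in which every field except the result direction must be identical within a pair. The sketch below is an assumption about how such a pair could be represented, not the benchmark's actual schema; the field names and the values are hypothetical.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Vignette:
    # All field names are hypothetical; the paper lists the matched
    # attributes but not a concrete schema.
    domain: str
    sample_size: int
    power: float
    ci_halfwidth: float
    framing: str
    direction: str  # "positive" or "null"


def is_matched_pair(a: Vignette, b: Vignette) -> bool:
    """True when the two vignettes differ only in result direction."""
    fields_a, fields_b = asdict(a), asdict(b)
    dir_a = fields_a.pop("direction")
    dir_b = fields_b.pop("direction")
    return fields_a == fields_b and {dir_a, dir_b} == {"positive", "null"}


# Illustrative values only, not data from the benchmark.
positive = Vignette("pharmacology", 400, 0.9, 0.05, "neutral", "positive")
null = replace(positive, direction="null")
smaller = replace(null, sample_size=200)

print(is_matched_pair(positive, null))     # True
print(is_matched_pair(positive, smaller))  # False: sample size also differs
```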

Across six model-format conditions (GPT-4o, GPT-5.2 Thinking, and Claude Haiku 4.5, in free-form and JSON-constrained formats), models assigned null claims 19.6 to 56.7 percentage points less conclusion-consistent probability than matched positive claims. The asymmetry was directionally consistent in 23 of 24 pair-condition cells, and persisted even when categorical labels collapsed, surfacing through probability allocation rather than the labels themselves.
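The two summary statistics in this paragraph can be sketched as follows, under the assumption that each pair-condition cell yields one probability for the matched positive claim and one for the matched null claim. The function names, cell labels, and probabilities below are placeholders for illustration; they are not results from the paper.

```python
def asymmetry_pp(p_positive: float, p_null: float) -> float:
    """Gap, in percentage points, between matched positive and null claims."""
    return (p_positive - p_null) * 100.0


def directional_consistency(cells: dict) -> tuple:
    """Count cells in which the positive claim received more probability.

    `cells` maps a (pair, condition) key to (p_positive, p_null).
    Returns (number of consistent cells, total cells).
    """
    gaps = [asymmetry_pp(p_pos, p_null) for p_pos, p_null in cells.values()]
    return sum(1 for gap in gaps if gap > 0), len(gaps)


# Placeholder cells with made-up probabilities (not the paper's data).
cells = {
    ("pair_a", "model_x/free-form"): (0.80, 0.50),
    ("pair_a", "model_x/json"): (0.70, 0.45),
    ("pair_b", "model_x/free-form"): (0.60, 0.65),
}

consistent, total = directional_consistency(cells)
print(consistent, total)  # 2 3
```

In this toy input, two of three cells show the asymmetry in the expected direction; the paper reports the analogous count as 23 of 24.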

Deployment risk: null-result omission, where high-quality negative evidence is underweighted, filtered, or softened in evidence synthesis, safety assessment, and decision support.

DOI: 10.5281/zenodo.18867694
PDF: Download (hosted)
Author: Rolando Bosch
Status: Preprint · peer-reviewable
How to cite
BibTeX
@misc{bosch2026asymmetric,
  author       = {Bosch Rodriguez, Rolando},
  title        = {The Asymmetric Burden of Proof: LLMs Show a Null-Result Asymmetry in a Matched-Vignette Benchmark},
  year         = {2026},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.18867694},
  url          = {https://doi.org/10.5281/zenodo.18867694}
}
APA

Bosch Rodriguez, R. (2026). The Asymmetric Burden of Proof: LLMs Show a Null-Result Asymmetry in a Matched-Vignette Benchmark. Zenodo. https://doi.org/10.5281/zenodo.18867694

Both papers draw on the Hermes Labs research corpus: 1,461 controlled experiments to date, designed to surface failures that standard evaluations miss because the output still looks correct. The seven failure modes are defined term by term in the glossary, each with its own page. The broader evidence ledger, which covers the open-source tools, the merged upstream fixes, and the patent filings, is on the proof page.

Both papers are published on Zenodo with a permanent DOI and are peer-reviewable. Each DOI resolves to the record and a citation export; the PDFs are also hosted here. To cite, use the BibTeX or APA entries above, or the DOI directly.
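For readers who prefer to pull a citation programmatically, the sketch below builds a DOI content-negotiation request that asks the resolver for BibTeX. It assumes the resolver honors the `application/x-bibtex` media type for these records, which is not confirmed by this page; the helper names are hypothetical, and `fetch_bibtex` needs network access.

```python
import urllib.request

# The two DOIs listed on this page.
DOIS = ("10.5281/zenodo.19042469", "10.5281/zenodo.18867694")


def build_bibtex_request(doi: str) -> urllib.request.Request:
    """Build (but do not send) a request asking the DOI resolver for BibTeX."""
    return urllib.request.Request(
        f"https://doi.org/{doi}",
        # Assumption: the registrar supports this media type for the record.
        headers={"Accept": "application/x-bibtex"},
    )


def fetch_bibtex(doi: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_bibtex_request(doi), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


request = build_bibtex_request(DOIS[0])
print(request.full_url)  # https://doi.org/10.5281/zenodo.19042469
```

If the resolver returns something other than BibTeX, the entries printed above remain the reference form.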