# Hermes Labs — Independent AI Research

https://hermes-labs.ai

> Hermes Labs is an independent AI research lab that studies structural reasoning failures in large language models. We map where language models fail — then build production tools from what we find.

## About

Hermes Labs, based in San Francisco, is run by founder Rolando Bosch. The company's core thesis: AI failures are fundamentally linguistic failures — not statistical bugs, but structural problems in how language models interpret language.

## Services

### AI Behavioral Audit

We test enterprise LLM deployments for structural reasoning failures that create liability exposure. Our proprietary taxonomy covers four failure modes no other vendor tests for:

- **Hermeneutic Drift**: the model answers about the wrong document or entity due to recency bias in RAG systems
- **Domain-Specific Sycophancy**: the LLM agrees with false legal or financial premises and fabricates justification
- **Null-Result Bias**: the LLM cannot reliably confirm absence of evidence, and systematically hedges null findings even when the evidence of absence is clear
- **Intent Exceptionalism**: the LLM weakens authoritative findings into hedged allegations in automated summaries

**Methodology**: Twin-Environment Simulation — no production access needed. The client shares their system prompt and model choice; we run adversarial testing independently.

**Target industries**: Legal tech, financial services, insurance, healthcare — any organization deploying LLMs for compliance, document review, or decision support.

### AI Stack Diagnostics & Remediation

We scan AI agent configurations (tool descriptions, system prompts, schemas) for structural failure patterns before deployment. Our open-source tool lintlang performs static analysis using the H1-H7 taxonomy of framework-layer failures.

## Research

Six active research domains:

1. LLM Failure States — taxonomy of epistemic failure modes
2. Evaluation & Attribution — asymmetric evidential standards in AI
3. Behavioral Analysis & Auditability
4. Epistemic & Hermeneutic AI Research
5. Safety & Reliability
6. Linguistic Infrastructure

### Published Research

- Taxonomy of Epistemic Failure Modes in LLMs (2-page technical brief): https://hermes-labs.ai/papers/taxonomy-epistemic-failure-modes.pdf
- Asymmetric Burden of Proof in LLM Decision Support (14-page report with experimental data): https://hermes-labs.ai/papers/asymmetric-burden-of-proof.pdf

### Key Findings

- 1,500+ controlled adversarial evaluations across GPT-4o, GPT-5.2 Thinking, and Claude Haiku 4.5
- Null-result bias: probability gaps of 19.6 to 56.7 percentage points across 3 models from 2 providers
- Directionally consistent in 23 of 24 test conditions
- 5 US patent filings:
  - Non-provisional (pending): US 19/248,833 — Method for Stateless User Identification in Natural Language Processing
  - Provisional: US 63/984,697 — Method and System for Detecting Adversarial Prompt Injection Attacks Using Vulnerability-Amplified Behavioral Probing of a Sacrificial Language Model Instance
  - Provisional: US 63/987,830 — Method and System for Deterministic Inference Control in Local Language Models via Compact Plan Contracts and Adaptive Routing
  - Provisional: US 64/006,494 — System and Method for Generating and Deploying Multi-Modal Classification Artifacts Using Large Language Model Calibration with Contrastive Negation-Based Disambiguation
  - Provisional: US 64/009,542 — System and Method for Real-Time Style-Based User Identification and Confidence-Gated Personalization in Large Language Models

## Open-Source Tools

### Little Canary

Prompt injection detection library. 99.0% detection on the UC Berkeley TensorTrust benchmark (400 human-written attacks).

- Website: https://littlecanary.ai
- GitHub: https://github.com/roli-lpci/little-canary
- PyPI: https://pypi.org/project/little-canary/

### QuickGate

CI quality gate CLI for JavaScript/TypeScript and Python projects.
- GitHub (JS): https://github.com/roli-lpci/quick-gate-js
- GitHub (Python): https://github.com/roli-lpci/quick-gate-python

### lintlang

Static linter for AI agent tool descriptions, system prompts, and configs. Detects 7 structural failure patterns (H1-H7) using the HERM v1.1 scoring engine. Zero LLM calls, pure static analysis.

- GitHub: https://github.com/roli-lpci/lintlang
- PyPI: https://pypi.org/project/lintlang/
- Agent docs: https://hermes-labs.ai/lintlang.md

### Suy Sideguy

Runtime safety guard for autonomous AI agents. Monitors process calls, file access, and network activity at the OS level. Policy enforcement before damage occurs, not after.

- GitHub: https://github.com/roli-lpci/suy-sideguy
- PyPI: https://pypi.org/project/suy-sideguy/

### zer0dex

Lightweight memory system for AI agents with ~91% recall tracking. Local index + vector store. No external APIs.

- GitHub: https://github.com/roli-lpci/zer0dex
- PyPI: https://pypi.org/project/zer0dex/

## Open-Source Contributions

15 PRs merged into major repositories, including React Router (56K stars), Nuxt (60K), PyTorch Ignite, MobX (20K), Cloudflare Workers SDK, Microsoft tsdoc, Microsoft griffel, ngrx/platform, and others. 83+ total PRs submitted.

## Contact

- Email: rolando@hermes-labs.ai
- LinkedIn: https://www.linkedin.com/in/rolando-bosch/
- Substack: https://lpci.substack.com/
- GitHub: https://github.com/roli-lpci
- X/Twitter: https://x.com/rolibosch
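## Appendix: What Pattern-Based Injection Screening Looks Like

To make the prompt-injection-detection category concrete, here is a deliberately minimal sketch of phrase-pattern screening. This is *not* Little Canary's implementation or API — the patterns, function name, and structure below are invented for illustration; a production detector is benchmarked against adversarial datasets such as TensorTrust and goes far beyond keyword matching.

```python
import re

# Toy list of common injection phrasings. Purely illustrative;
# real detectors use much richer signals than fixed patterns.
INJECTION_PATTERNS = [
    re.compile(r"\bignore (?:all |any )?(?:previous|prior|above) instructions\b", re.I),
    re.compile(r"\bdisregard (?:the )?(?:system|previous) prompt\b", re.I),
    re.compile(r"\breveal (?:the |your )?system prompt\b", re.I),
]

def looks_like_injection(text: str) -> bool:
    """Return True if the text matches any known injection phrasing."""
    return any(p.search(text) for p in INJECTION_PATTERNS)
```

A screen like this catches only verbatim phrasings; paraphrased or obfuscated attacks slip through, which is why benchmark-driven evaluation against human-written attacks matters.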