Proof · Technical record
Last revised 23 Apr 2026

Evidence, with citations.

Every claim on this site is anchored to a filing number, a DOI, a merged pull request, or a public repository. This page is the citation ledger. It is structured the way a procurement auditor would want to read it.

Paper · 2025

Taxonomy of Epistemic Failure Modes in Large Language Models

A structured account of how language models fail in ways standard evaluations miss: scalar confidence inflation, validation loops, hermeneutic drift, sycophancy, null-result discounting, and intent exceptionalism.

DOI: 10.5281/zenodo.19042469
Format: PDF · 34 pp.

Paper · 2025

The Asymmetric Burden of Proof: How Language Models Systematically Discount Negative Findings

Empirical investigation of a bias we see regularly in audits: models apply stricter evidential standards to null or negative claims than to positive ones, producing unreliable guidance in compliance and decision-support contexts.

DOI: 10.5281/zenodo.18867694
Format: PDF · 22 pp.

Non-provisional · pending

Method for Stateless User Identification in Natural Language Processing

This is the non-provisional application. The provisionals below cover adjacent mechanisms: adversarial probing, deterministic inference, multi-modal classification calibration, and confidence-gated personalization.

Application: US 19/248,833
Status: Pending

Provisional

Detecting Adversarial Prompt Injection via Vulnerability-Amplified Behavioral Probing

Status: Filed

Provisional

Deterministic Inference Control in Local Language Models via Compact Plan Contracts and Adaptive Routing

Status: Filed

Provisional

Multi-Modal Classification Artifacts with LLM Calibration + Contrastive Negation-Based Disambiguation

Status: Filed

Provisional

Real-Time Style-Based User Identification and Confidence-Gated Personalization in LLMs

Status: Filed

LangChain · 135k ★

Fixed thinking + tools crash in ChatAnthropic.bind_tools()

Anthropic’s API rejects a forced tool_choice (any, or a named tool) when extended thinking is enabled. LangChain’s bind_tools() was force-setting it on every call, so any agent combining Claude thinking with tool use got a 400 response. The fix ports the guard the structured-output path already had.
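
The shape of such a guard can be sketched in a few lines. This is a minimal illustration, not LangChain's code: `build_tool_request` and `FORCED_CHOICE_TYPES` are hypothetical names, and the sketch assumes that the choices rejected under extended thinking are the forcing ones (`any`, or a named `tool`), while `auto` and `none` pass through.

```python
from typing import Any, Dict, List, Optional

# Assumption: these are the tool_choice types that force a tool call.
FORCED_CHOICE_TYPES = {"any", "tool"}


def build_tool_request(
    tools: List[Dict[str, Any]],
    tool_choice: Optional[Dict[str, Any]] = None,
    thinking: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble request kwargs, dropping a forced tool_choice under thinking.

    Hypothetical helper for illustration only; it is not part of LangChain
    or the Anthropic SDK.
    """
    request: Dict[str, Any] = {"tools": tools}
    thinking_enabled = thinking is not None and thinking.get("type") == "enabled"
    if thinking is not None:
        request["thinking"] = thinking

    if tool_choice is not None:
        forced = tool_choice.get("type") in FORCED_CHOICE_TYPES
        # Only forward tool_choice when it cannot trigger the 400.
        if not (thinking_enabled and forced):
            request["tool_choice"] = tool_choice
    return request
```

With thinking enabled and `{"type": "any"}` passed in, the returned dict has no `tool_choice` key, so the model falls back to its default tool selection. Without thinking, the forced choice is forwarded unchanged.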

Pull request: langchain/pull/35544
Status: Merged

Microsoft Semantic Kernel · 28k ★

Fixed ChatHistoryTruncationReducer silently deleting system prompts

The truncation reducer was calling extract_range, a summarization helper that filters out system and developer messages. System prompts were being dropped with no warning. Ported Microsoft’s own .NET fix to the Python SDK.
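
The intended behavior, truncating while keeping system and developer messages, can be sketched as follows. This is an illustration under stated assumptions, not the Semantic Kernel implementation: messages are plain dicts with a `role` key, and `truncate_history` and `PINNED_ROLES` are hypothetical names.

```python
from typing import Dict, List

# Assumption: roles that must survive truncation.
PINNED_ROLES = {"system", "developer"}


def truncate_history(
    messages: List[Dict[str, str]], target_count: int
) -> List[Dict[str, str]]:
    """Keep every pinned-role message plus the most recent other messages.

    The result holds at most target_count messages, unless the pinned
    messages alone already exceed that number. Original order is preserved.
    """
    pinned_idx = [i for i, m in enumerate(messages) if m["role"] in PINNED_ROLES]
    other_idx = [i for i, m in enumerate(messages) if m["role"] not in PINNED_ROLES]

    # Slots left for ordinary messages after reserving room for pinned ones.
    budget = max(target_count - len(pinned_idx), 0)
    start = max(len(other_idx) - budget, 0)

    keep = set(pinned_idx) | set(other_idx[start:])
    return [m for i, m in enumerate(messages) if i in keep]
```

Reserving slots for pinned messages before counting the rest is one possible design. It means a long conversation loses its oldest user and assistant turns first, and never its system prompt.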

Pull request: semantic-kernel/pull/13610
Status: Merged

A further 24 merged contributions (maintenance and typing modernization) across PyTorch Ignite, Optuna, React Router (56k ★), Nuxt (60k ★), MobX (20k ★), Cloudflare Workers SDK (6k ★), Sentry, ngrx, TSDoc, and others. Full list on GitHub.

Sealed evidence bundles are cryptographic chain-of-custody packages with per-finding provenance. They are verifiable offline, against a public key, without network access or trust in the Hermes infrastructure. The verifier returns one of two verdicts: authentic or tampered.
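
To make the two-verdict, offline check concrete, here is a rough sketch. The bundle layout is an assumption, since the page does not describe the real format: findings are hash-chained with SHA-256, the chain head is stored under `head`, and a hex signature over the head is stored under `signature`. The signature scheme is also unspecified, so the check is delegated to a caller-supplied `verify_signature` function. All names here are hypothetical.

```python
import hashlib
import json
from typing import Callable, Dict, List

GENESIS = "0" * 64  # assumed starting value for the hash chain


def finding_digest(prev_digest: str, finding: Dict) -> str:
    """Hash one finding together with the digest of the finding before it."""
    payload = json.dumps(finding, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_digest + payload).encode("utf-8")).hexdigest()


def chain_head(findings: List[Dict]) -> str:
    """Fold all findings into a single digest; any edit changes the head."""
    digest = GENESIS
    for finding in findings:
        digest = finding_digest(digest, finding)
    return digest


def verify_bundle(
    bundle: Dict,
    public_key: bytes,
    verify_signature: Callable[[bytes, bytes, bytes], bool],
) -> str:
    """Return "authentic" or "tampered" using only local data.

    verify_signature(public_key, message, signature) is supplied by the
    caller and should wrap a real public-key signature scheme.
    """
    try:
        head = chain_head(bundle["findings"])
        if head != bundle["head"]:
            return "tampered"
        signature = bytes.fromhex(bundle["signature"])
        ok = verify_signature(public_key, head.encode("ascii"), signature)
    except (KeyError, TypeError, ValueError):
        # Malformed or incomplete bundles are treated as tampered.
        return "tampered"
    return "authentic" if ok else "tampered"
```

Nothing in the sketch opens a socket or reads remote state: the only inputs are the bundle, the public key, and the signature check, which is what makes the verdict reproducible offline.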

Deployed in active engagements. Evidence structure cross-maps to:

  • EU AI Act · Annex IV: Technical documentation
  • ISO/IEC 42001: AI management system evidence
  • NIST AI RMF 1.0: Govern · Map · Measure · Manage
  • SOC 2 AI addendum: Control evidence for trust-services criteria