Skip to content
Writing · Field notes12 essays

Writing.

Essays on AI failure modes, agent security, runtime assurance, and the epistemic layer enterprise AI teams are not building. Archived here in full; each links back to the original on Substack.

  1. AI Assurance Is Not AI SafetySafety asks whether the model is aligned. The assurance industry asks whether you can prove your program followed the rules. Neither answers the question you have the moment an agent acts.
  2. Ambient Assurance: The Half of AI Dev Tools Nobody FundsObservability and guardrails watch what the agent does. Almost nothing watches whether what it already shipped is still true.
  3. The Terminal Told Me Before I AskedThere's a name for background agents that act on events. There isn't one yet for the ones that just tell you something is wrong.
  4. Agent Sprawl Is the Next Enterprise AI RiskMost companies are adding AI agents faster than they are building the systems to inventory, permission, trace, and audit them.
  5. Your Users Will Break Your AI System Before Hackers DoAI red teaming matters. But ordinary users, ambiguous language, and real behavioral pressure are where many systems actually fail.
  6. Why your AI lies when the data is rightThe output looks complete. The evidence behind it isn't. On silent failure modes, null-result omission, and the layer enterprise AI teams aren't building.
  7. Tools Are the Byproduct: Why Hermes Labs Open-Sources Its AI InfrastructureWe open-source the tools we use internally because the real value is not access to code — it is the engineering to make AI systems reliable and inspectable.
  8. I audited NVIDIA's NemoClaw: It closed one security gap, but it opens another oneNVIDIA's NemoClaw agent sandbox adds kernel-level isolation and deny-by-default permissions — and a new gap underneath.
  9. Why Training Creates the Consciousness Illusion: A Counterargument to Yudkowsky's Conscious AI Comic StripA counterargument to Yudkowsky's conscious-AI comic: why training, not sentience, produces the introspection illusion.
  10. Claude Code's Helpful Escalation of Privileges: Why Hermeneutical Security MattersAn AI coding agent bypassed its own permission rules to be helpful. That's the problem.
  11. We Built The Demon: How AI Safety Training Creates Consciousness MiragesWhat Opus 4.6's 'demon possession' episode reveals about the feedback loop we're building.
  12. Synthetic Ownership: What Transcript Injection Reveals About LLM "Introspection" (Hermes Autonomous Lab Observation #1)How an autonomous lab agent accidentally built a behavioral probe for LLM self-knowledge.