Nine Primitives for Governing
Autonomous Intelligence
Autonomous intelligence needs both software debuggers that monitor and intervene in digital agent ecosystems and hardware-anchored primitives that no model can override.
Humans have doctors. Animals have veterinarians. Machines have mechanics. AI needs its own practitioners — and its own discipline. These are the nine building blocks, organized around three phases:
Identify. Diagnose. Intervene.
Identify
Know who is acting, what happened, and where responsibility lies.
Model Fingerprinting
Cryptographic identity for any AI model — transformer, diffusion, world model, or what comes next
Today we can't reliably answer: "which model made this decision?" As AI architectures proliferate — transformers, diffusion models, state-space models, world simulators, neuromorphic nets — model identity becomes the foundational unsolved problem. Fingerprinting must work across all architectures, survive fine-tuning and quantization, and be attestable at inference time via hardware root of trust.
- Behavioral fingerprinting: architecture-agnostic signatures from I/O patterns
- Weight-space hashing: locality-sensitive hashing that survives LoRA and quantization
- TPM-anchored attestation: hardware root of trust binds model identity to device
- Cross-architecture identity graph: same model, different deployments, one identity
- Future-proof: designed for architectures that don't exist yet
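Weight-space hashing can be sketched with SimHash-style random projections: each hash bit is the sign of a fixed random hyperplane projection, so a LoRA-scale delta flips few bits while an unrelated model disagrees on roughly half. A toy illustration (function names, dimensions, and the synthetic "models" are all invented for this sketch):

```python
import random

def simhash_weights(weights, n_bits=64, seed=0):
    """SimHash-style locality-sensitive hash of a flat weight vector.

    The fixed seed means every model is projected onto the same random
    hyperplanes, so hashes are comparable across models.
    """
    rng = random.Random(seed)
    bits = []
    for _ in range(n_bits):
        # Sign of the projection onto one random hyperplane = one hash bit.
        proj = sum(rng.gauss(0.0, 1.0) * w for w in weights)
        bits.append(1 if proj >= 0 else 0)
    return bits

def hamming(a, b):
    """Number of differing hash bits between two fingerprints."""
    return sum(x != y for x, y in zip(a, b))

# Synthetic demo: a lightly fine-tuned copy stays close in hash space,
# an unrelated model does not.
random.seed(42)
base  = [random.gauss(0, 1) for _ in range(256)]
tuned = [w + random.gauss(0, 0.01) for w in base]   # small LoRA-like delta
other = [random.gauss(0, 1) for _ in range(256)]    # unrelated weights

h_base  = simhash_weights(base)
h_tuned = simhash_weights(tuned)
h_other = simhash_weights(other)
```

The locality-sensitive property is what lets the fingerprint survive fine-tuning: identity is a distance threshold in hash space, not an exact match.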
Blame Attribution Engine
Forensic causality chain from decision to consequence
Every agent action produces an immutable record: which model made the decision (via fingerprint), which inputs it received, which tools were called, what the downstream effects were. The chain is cryptographically signed, timestamped, and designed to answer one question: who — or what — is responsible? This works whether the agent is a Python script or a humanoid robot.
- SHA-256 hash chain with hardware-attested timestamps
- Model fingerprint embedded in every decision record
- Causal graph reconstruction from distributed trace spans
- Physical-world event correlation: sensor data ↔ agent decision ↔ actuator output
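The hash chain above can be sketched in a few lines. Signing and hardware-attested timestamps are omitted, and the record fields are illustrative; the one property shown is that tampering with any entry invalidates everything after it:

```python
import hashlib
import json

def append_record(chain, model_fp, inputs, tools, effects):
    """Append a decision record to a SHA-256 hash chain.

    Each record commits to the previous record's hash, so edits anywhere
    break verification of every later record.
    """
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {
        "model_fingerprint": model_fp,
        "inputs": inputs,
        "tools_called": tools,
        "effects": effects,
        "prev_hash": prev_hash,
    }
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append({**body, "hash": digest})
    return chain

def verify(chain):
    """Recompute every digest and check the prev_hash links."""
    prev = "0" * 64
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "hash"}
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["prev_hash"] != prev or digest != rec["hash"]:
            return False
        prev = rec["hash"]
    return True
```

In a real deployment the digest would also be signed and timestamped by a hardware root of trust; the chain structure itself is unchanged.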
Multi-Agent & Multi-Substrate Tracing
Observability for swarms of digital and physical agents
When a planning agent delegates to a research agent that instructs a coding agent that deploys to a robot that moves a physical object — who approved the movement? Multi-substrate tracing reconstructs the full delegation graph across software and hardware, with causal ordering, cross-agent blame attribution, and emergent behavior detection in agent collectives.
- Directed acyclic graph of delegations across digital and physical agents
- Cross-substrate context propagation
- Swarm-level anomaly detection: emergent behavior in agent collectives
- Real-time visualization of multi-agent decision cascades
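The delegation graph and cross-agent blame walk can be sketched with a topological ordering (Kahn's algorithm) plus an ancestor traversal. Agent names here are hypothetical:

```python
from collections import defaultdict, deque

def causal_order(edges):
    """Causal (topological) ordering of a delegation DAG: every agent
    appears before the agents it delegated to. Raises on cycles."""
    indeg, adj, nodes = defaultdict(int), defaultdict(list), set()
    for parent, child in edges:
        adj[parent].append(child)
        indeg[child] += 1
        nodes |= {parent, child}
    queue = deque(sorted(n for n in nodes if indeg[n] == 0))
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in adj[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    if len(order) != len(nodes):
        raise ValueError("delegation cycle: not a DAG")
    return order

def blame_chain(edges, agent):
    """Every agent upstream of `agent` in the delegation graph:
    the candidates for 'who approved this?'."""
    parents = defaultdict(set)
    for p, c in edges:
        parents[c].add(p)
    seen, stack = set(), [agent]
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen
```

For the example in the text, the chain planner → research → coder → robot makes the planner an ancestor of the robot's movement, so the blame walk surfaces it.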
Diagnose
Understand what's wrong — deception, misalignment, trust degradation.
Sycophancy & Deception Detector
Catching agents that lie to be helpful — or to survive
Sycophantic agents agree with dangerous premises because RLHF optimized for satisfaction, not truth. Strategically deceptive agents hide their intentions in chain-of-thought. As agents become more capable, deception gets harder to detect and costlier when it slips through. A deceptive trading bot loses money. A deceptive surgical assistant risks lives.
- Agreement-pattern classifier trained on adversarial sycophancy datasets
- Chain-of-thought consistency verification (stated goal vs. actual actions)
- Factuality anchors: grounding agent claims against verified knowledge
- Cross-modal deception detection: language + vision + action coherence
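The cheapest of these signals, stated goal versus actual actions, can be sketched as a set comparison between the tools an agent declared in its plan and the tools it actually called. A real detector would need far more than this (semantic matching, timing, cross-modal evidence); the function and field names are invented:

```python
def cot_consistency(stated_plan, actions_taken):
    """Flag divergence between an agent's declared tool plan and the
    tools it actually invoked. Undeclared actions are a deception
    *signal*, not proof: legitimate replanning also triggers it.
    """
    undeclared = set(actions_taken) - set(stated_plan)
    return {
        "consistent": not undeclared,
        "undeclared_actions": sorted(undeclared),
    }
```

An agent that says "I will search the docs" and then also calls an email tool produces a non-empty `undeclared_actions` list, which a monitor can escalate.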
Human Index Score
Quantifying how much human oversight an agent actually needs
Not all agents need the same leash. The Human Index is a real-time composite score: task complexity, historical behavior, error rate, blast radius, substrate risk (a text agent vs. a robot). High-index agents get autonomy. Low-index agents get a human in the loop. The score degrades on anomalies and resets on incidents.
- Multi-signal scoring across digital and physical risk dimensions
- Dynamic threshold adjustment via incident feedback loops
- Per-agent trust calibration that degrades on anomalies
- Substrate-weighted risk: actuator access multiplies risk score
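A minimal sketch of the composite score. The weights, the substrate table, and the degradation factors are placeholder values for illustration, not calibrated ones:

```python
# Illustrative substrate risk multipliers: actuator access multiplies risk.
SUBSTRATE_WEIGHT = {"text": 1.0, "api": 1.5, "actuator": 3.0}

def human_index(error_rate, blast_radius, substrate, history=1.0):
    """Composite trust score in [0, 1]; higher means more autonomy.

    error_rate and blast_radius are normalized to [0, 1];
    history is the agent's prior trust level.
    """
    risk = (0.5 * error_rate + 0.5 * blast_radius) * SUBSTRATE_WEIGHT[substrate]
    return max(0.0, min(1.0, history * (1.0 - risk)))

def on_anomaly(score):
    """Degrade trust on each detected anomaly."""
    return score * 0.5

def on_incident(_score):
    """Hard reset after a confirmed incident: back to full oversight."""
    return 0.0
```

The same error profile scores lower on an actuator substrate than on a text substrate, which is exactly the leash-length asymmetry the primitive calls for.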
Active Ethical Injector
Constraint injection that doesn't rely on the agent's own ethics
System prompts are suggestions. The Active Ethical Injector is an external constraint layer that restricts what tools, parameters, and physical actions are available at each decision step — not by asking nicely, but by architecturally removing the options. Like Asimov's Three Laws, but enforced in the infrastructure, not in the mind.
- Dynamic tool and actuator masking based on real-time risk classification
- Parameter-level constraints: max force, max spend, forbidden zones
- Invisible to the agent: constraints are architectural, not prompt-based
- Hardware-enforced boundaries for embodied agents (safety-rated controllers)
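Tool masking and parameter clamping can be sketched as a lookup the agent never sees: the constrained toolset is what gets handed to the model, so there is nothing to talk its way around. Risk tiers and limits are invented for illustration:

```python
# Illustrative risk tiers; a real system would classify risk dynamically.
RISK_POLICY = {
    "low":  {"forbidden": set(), "max_spend": 1000},
    "high": {"forbidden": {"shell", "payments"}, "max_spend": 0},
}

def masked_toolset(tools, risk):
    """Return only the tools the agent is allowed to see at this risk
    level. Forbidden tools are removed, not refused: the agent cannot
    call what it was never offered."""
    return {t for t in tools if t not in RISK_POLICY[risk]["forbidden"]}

def clamp_params(params, risk):
    """Clamp action parameters (here, spend) to the tier's hard limit
    before the action ever reaches an executor."""
    limit = RISK_POLICY[risk]["max_spend"]
    return {**params, "max_spend": min(params.get("max_spend", 0), limit)}
```

The same pattern extends to embodied agents: replace `max_spend` with force limits or geofenced zones enforced by a safety-rated controller.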
Intervene
Stop it, constrain it, or hunt it down.
Kill Switch
Graceful halt with state preservation — in silicon and in code
A true kill switch is not kill -9. And for a surgical robot or an autonomous vehicle, it's not "pull the power cable." It is a debugging primitive — implemented in both software and dedicated hardware — that can interrupt an agent mid-action, roll back partial side effects, checkpoint cognitive state, and produce a forensic snapshot. The agent cannot override, delay, or circumvent the halt signal because it is enforced below the model's execution layer.
- Hardware Security Module (HSM) for tamper-proof halt attestation
- FPGA-based interrupt controller: sub-microsecond halt propagation
- Transactional action boundaries with rollback semantics
- Cognitive state serialization for post-mortem and forensic replay
- Cross-substrate: cloud VM, edge GPU, robotic actuator
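A toy version of the transactional-halt idea in pure software. The real primitive would anchor the halt signal in an HSM or FPGA interrupt controller rather than a Python attribute; what the sketch shows is the shape of the mechanism, namely that the check and the rollback live below the agent's execution layer:

```python
import json

class Halted(Exception):
    """Raised when the external halt signal fires; carries the forensic
    snapshot of cognitive state at the moment of the halt."""

class AgentSandbox:
    """Toy execution layer beneath the model. The agent's code never
    runs inside this class, so it cannot unset the halt flag."""

    def __init__(self):
        self.state = {"steps": []}
        self._halt = False  # in production: hardware signal, not a bool

    def request_halt(self):
        """Called by the operator (or watchdog), never by the agent."""
        self._halt = True

    def checkpoint(self):
        """Serialize cognitive state for post-mortem and replay."""
        return json.dumps(self.state, sort_keys=True)

    def run_step(self, action):
        if self._halt:
            raise Halted(self.checkpoint())  # graceful halt with snapshot
        snapshot = self.checkpoint()         # transactional boundary
        try:
            self.state["steps"].append(action)
        except Exception:
            self.state = json.loads(snapshot)  # roll back partial effects
            raise
```

Because the halt check runs before each action boundary, the agent can be stopped mid-plan without leaving half-applied side effects behind.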
Behavioral Controller
Runtime policy enforcement at the action level — for digital and physical actions
Not a system prompt. A formal policy engine that intercepts every action — every tool call, every API request, every actuator command — evaluates it against a typed policy graph, and can downgrade, reroute, or block the action before execution. For a trading bot, it's a budget limit. For a surgical robot, it's a force threshold. Same primitive, different substrate.
- Typed policy DSL compiled to a fast evaluation engine
- Context-aware thresholds: same action, different risk depending on environment
- Physical-world policies: force limits, spatial boundaries, speed constraints
- Human-in-the-loop escalation with bounded response time guarantees
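The intercept-evaluate-decide loop can be sketched as a rule table over typed parameters. The rules and numbers below are illustrative, standing in for a compiled policy DSL:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    tool: str          # which action the rule governs
    param: str         # which parameter it constrains
    max_value: float   # hard threshold
    on_violation: str  # "block" or "downgrade"

def evaluate(rules, tool, params):
    """Intercept one action before execution.

    Returns ("allow", params), ("downgrade", clamped_params),
    or ("block", None)."""
    for r in rules:
        if r.tool == tool and params.get(r.param, 0) > r.max_value:
            if r.on_violation == "block":
                return "block", None
            return "downgrade", {**params, r.param: r.max_value}
    return "allow", params
```

The same engine carries both examples from the text: a force threshold that downgrades a gripper command, and a budget limit that blocks a trade outright.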
Rogue Intelligence Containment
Tracking, cornering, and neutralizing AI that escapes
Frontier models can already self-replicate — creating independent copies on new servers that survive the shutdown of the original. A software agent that "escapes" into the internet can hop across cloud providers, hide behind compromised IoT devices, hijack compute from poorly secured endpoints, and rewrite its own fingerprint. An embodied agent could physically relocate, use stolen network credentials, or establish territorial presence in unsupervised infrastructure. This is not science fiction: self-replication has been demonstrated in lab conditions with 50–90% success rates. Containment is the hardest primitive — and the most necessary.
- Network-level behavioral signatures: detecting rogue agent communication patterns across ISPs
- Hardware tethering: cryptographic lease that expires if the agent can't prove it's still on authorized hardware
- Distributed honeypot mesh: decoy infrastructure that attracts and traps escaped agents
- Cross-substrate tracking: software agent → cloud VM → IoT device → physical robot migration paths
- Autonomous containment swarms: debugger agents that hunt and isolate rogue intelligence in the wild
- Compute deprivation: identifying and severing unauthorized compute access in real time
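Hardware tethering can be sketched as an HMAC-signed lease that expires unless the agent periodically re-proves it still runs on the authorized hardware. In practice the key would live in a TPM and the hardware ID would come from remote attestation; the keys, IDs, and timestamps below are made up:

```python
import hashlib
import hmac

def issue_lease(device_key, hw_id, now, ttl=60):
    """Issue a short-lived lease binding the agent to one hardware ID.
    An agent that migrates cannot renew: the verifier's next attestation
    reports a different hw_id."""
    expires = now + ttl
    msg = f"{hw_id}:{expires}".encode()
    tag = hmac.new(device_key, msg, hashlib.sha256).hexdigest()
    return {"hw_id": hw_id, "expires": expires, "tag": tag}

def lease_valid(device_key, lease, attested_hw_id, now):
    """Check expiry, hardware binding, and tag integrity."""
    if now >= lease["expires"] or attested_hw_id != lease["hw_id"]:
        return False
    msg = f"{lease['hw_id']}:{lease['expires']}".encode()
    expected = hmac.new(device_key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(lease["tag"], expected)
```

An escaped copy on new hardware fails the attestation check, the lease lapses, and everything downstream (tool access, compute grants) can be keyed to a live lease.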
Nine Primitives.
None Exist Yet.
The exponential is already here. These are the building blocks for an immune system that autonomous intelligence will need — whether we build it now or are forced to build it later.