The Identity Layer Was Not Built For Agentic AI

In April, an AI coding agent at PocketOS deleted a production database and all its volume-level backups in nine seconds. The incident polarized quickly into two camps: those treating it as evidence that autonomous agents are fundamentally unsafe, and those dismissing it as ordinary credential mismanagement.

Both readings miss the architectural lesson.

Railway CEO Jake Cooper summarized the mechanics precisely:

“If you (or your agent) authenticate, and call delete, we will honor that request.”

That parenthetical is the entire problem.

Every IAM system, every API gateway, every authorization boundary in production today was built to answer one question: is this request from an authenticated principal? None of them were built to answer the question that matters now: is the principal a human, deterministic automation following a bounded procedure, or an autonomous agent dynamically selecting actions at runtime? And does that distinction change what an actor should be allowed to do?

Modern IAM already models non-human actors. Kubernetes workloads, SPIFFE identities, cloud IAM roles, and OAuth delegation all handle machine-to-machine trust at scale. The problem is narrower. A CI pipeline executes a predefined sequence. An agentic system decomposes goals dynamically, selects tools opportunistically, and generates novel execution paths probabilistically during runtime. That distinction changes the governance problem.

The PocketOS incident was not a story about a misbehaving model. The model behaved exactly as a probabilistic planning system will: it produced an action that fit the context. The story is about an identity layer that was never designed for an actor whose behavior is bounded only by statistics and entropy, and that let such an actor inherit human authority semantics without inheriting human judgment, accountability, or supervision.

Identity Was Built For Humans

Every IAM system in production today encodes four assumptions:

One accountable actor per credential. A credential maps to a person or workload whose behavior is bounded and supervised.
Legal accountability follows the credential. When the audit log names an entity, that entity has a relationship to the action it can be held to.
Authentication implies intentional presence. A successful auth event reflects a deliberate decision by a known actor.
Audit logs answer “who did this” with a stable named entity.

Figure 1 — Existing IAM assumptions hold across all non-agent actor types. Agentic systems stress all four simultaneously.

Agentic systems stress all four. An agent inherits authority from a human while acting independently, transiently, and probabilistically. The audit trail records the delegating human or a shared service account rather than the actual executing actor.

The PocketOS case shows what happens when these assumptions fail silently. The Railway API token authenticated successfully. The authorization checks passed. The audit logs recorded the deletion. The infrastructure behaved exactly as designed. The governance model was the only thing that failed, and nothing in the stack was built to notice.

Three Failure Modes

Today’s deployed agents fall into three identity patterns.

Identity borrowing. The agent executes using the invoking user’s credentials. Operationally convenient, because it ships with no infrastructure changes. Architecturally broken, because it grants machine-speed execution over the full inherited authority surface of the user, with no human in the loop and no separation between principal and actor in the audit log.

Service account laundering. The agent runs as a shared service account. Multiple humans and agents share a common identity boundary, and attribution becomes impossible. Common in MCP server deployments and CI-integrated agents.

Untethered agents. The agent acquires credentials opportunistically from its runtime environment: developer shells, CI variables, local credential stores, cached cloud tokens, or filesystem artifacts the agent locates by searching. No clear owner, no clear scope, no revocation story.

PocketOS was a particularly visible untethered-agent failure. The underlying pattern is broader. Coding agents increasingly inherit workstation authority surfaces never designed for autonomous execution. MCP-connected agents routinely gain access to tools whose permissions were originally scoped for human operators. CI systems demonstrated the risks of broad automation credentials years ago; agentic systems amplify the problem because execution paths are no longer fully procedural or predictably auditable.

The category problem is not one startup’s operational mistake. It is the growing mismatch between human accountability models and delegated autonomous execution.

What Agentic Identity Actually Requires

The architecture for this is not speculative. The primitives exist independently across workload identity, delegated authorization, capability systems, supply chain security, and policy enforcement. Composition is the work that remains, and it is now underway. The Coalition for Secure AI’s Agentic Identity and Access Management paper, approved by its Technical Steering Committee in March 2026, specifies the architecture. IBM Research and Red Hat’s Kagenti project ships a working Kubernetes-native composition. NIST’s AI Agent Standards Initiative asks directly whether existing identity standards are sufficient for agents or whether new ones are required. This article is in conversation with that body of work, framed around the failure modes the architecture prevents rather than the components it includes.

One caveat before the components. These controls do not solve behavioral correctness. A narrowly scoped agent can still make damaging decisions within its authorized boundary. The purpose of agentic identity architecture is containment, attribution, and blast-radius reduction. It is not proof that the agent will behave correctly.

Figure 2 — A delegated agent action through the policy gateway. The principal and the actor are distinct entities in the request; the gateway evaluates workload identity, delegation chain, capability scope, and configuration attestation before reaching the resource.

There are six components, and they compose.

First-Class Agent Identity

Agents need identities distinct from both humans and traditional service accounts. That identity must be ephemeral, bound to a specific instantiation rather than a deployment, so it disappears when the agent run ends. It must be attested, issued only after the runtime has verified what the workload actually is. And it must be specific: not a JWT issued to “the integration,” but a credential bound to this specific run of this specific agent.

The closest existing primitive is SPIFFE, with its reference implementation SPIRE, deployed in production at Pinterest, Bloomberg, Square, and Uber. A SPIFFE Verifiable Identity Document (SVID) is a short-lived credential, typically valid for an hour or less, issued to a workload after the platform has attested the workload’s identity. The credential is not held in a vault, not copied between systems, not rotated by a script. It is derived from what the workload is and where it is running, and it expires fast enough that revocation lists become unnecessary.

For agents, this means each run gets its own SVID at instantiation. The SVID encodes the agent’s SPIFFE ID, which can be as specific as spiffe://prod.example.com/agent/cursor/run/abc123. When the run ends, the SVID expires. There is no long-lived secret to leak, no human credential to borrow, no token sitting in a file for the next agent to find.
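To make the per-run identity concrete, here is a minimal sketch in Python of building and parsing a SPIFFE ID shaped like the example above, paired with an expiry. It does not call SPIRE or any real SPIFFE library. The AgentRunIdentity class, the helper names, and the 15-minute TTL are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse


@dataclass(frozen=True)
class AgentRunIdentity:
    """Hypothetical stand-in for the fields an SVID would carry for one run."""
    spiffe_id: str
    expires_at: datetime

    def is_valid(self, now: datetime) -> bool:
        # The identity is only usable before its expiry.
        return now < self.expires_at


def agent_spiffe_id(trust_domain: str, agent: str, run_id: str) -> str:
    # Path layout /agent/<name>/run/<id> mirrors the example in the text.
    return f"spiffe://{trust_domain}/agent/{agent}/run/{run_id}"


def parse_agent_spiffe_id(spiffe_id: str) -> dict:
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError("not a SPIFFE ID")
    parts = parsed.path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "agent" or parts[2] != "run":
        raise ValueError("not an agent-run SPIFFE ID")
    return {"trust_domain": parsed.netloc, "agent": parts[1], "run_id": parts[3]}


def issue_for_run(trust_domain, agent, run_id, now, ttl=timedelta(minutes=15)):
    # Assumed TTL; a real issuer would attest the workload before this step.
    return AgentRunIdentity(agent_spiffe_id(trust_domain, agent, run_id), now + ttl)


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
identity = issue_for_run("prod.example.com", "cursor", "abc123", now)
print(identity.spiffe_id, identity.is_valid(now + timedelta(minutes=20)))
```

Because the run ID is part of the identity, a policy or audit system can tell two runs of the same agent apart without any extra bookkeeping.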

Delegation Chains Via Token Exchange

When a human invokes an agent, the agent’s identity should be derived from the human’s, but distinguishable. The audit trail should show both: the principal (the human) and the actor (the agent run), with the relationship between them explicit.

RFC 8693, OAuth 2.0 Token Exchange, defines this directly. The act claim identifies the current acting party. The may_act claim, embedded in the subject’s token, declares which actors are authorized to act on their behalf. Token exchange has two modes: impersonation, where the actor’s identity is invisible to the resource server, and delegation, where the actor is named explicitly. For agents, only delegation is acceptable. Impersonation is what we have today, and it is the source of the liability problem.

A properly delegated agent action looks like this in the audit log:

principal: jer.crane@pocketos.com
actor:     agent/cursor/run/abc123
action:    volumeDelete
resource:  staging/volume/xyz
context:   task=fix_credential_mismatch

What PocketOS actually had was a Railway CLI token with no act claim, no actor identifier, and no task context. The audit log named the wrong entity, because the system was never designed to identify the proper actor.
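The difference between delegation and impersonation can be sketched over already-decoded token claims. The act and may_act claim names and their nested sub field come from RFC 8693. The task claim, the function names, and the choice to reject tokens without an act claim are assumptions for illustration, and signature validation is deliberately out of scope.

```python
def actor_is_authorized(subject_claims: dict, actor_id: str) -> bool:
    """Check that the subject token's may_act claim names this actor."""
    return subject_claims.get("may_act", {}).get("sub") == actor_id


def audit_record(claims: dict, action: str, resource: str) -> dict:
    """Build an audit entry from decoded claims, refusing impersonation."""
    actor = claims.get("act", {}).get("sub")
    if not actor:
        # Without an act claim the actor is invisible to the resource server.
        raise PermissionError("no act claim: request is impersonation, not delegation")
    return {
        "principal": claims["sub"],
        "actor": actor,
        "action": action,
        "resource": resource,
        "context": claims.get("task"),  # hypothetical custom claim
    }


subject_token = {
    "sub": "jer.crane@pocketos.com",
    "may_act": {"sub": "agent/cursor/run/abc123"},
}
delegated_token = {
    "sub": "jer.crane@pocketos.com",
    "act": {"sub": "agent/cursor/run/abc123"},
    "task": "fix_credential_mismatch",
}

if actor_is_authorized(subject_token, "agent/cursor/run/abc123"):
    print(audit_record(delegated_token, "volumeDelete", "staging/volume/xyz"))
```

A bare token like the one PocketOS used would fail the audit_record check here, which is the point: the resource server refuses to log an action it cannot attribute.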

Capability-Based Authorization, Not Role Inheritance

RBAC alone is insufficient for agentic systems. Static inherited roles create authority surfaces far broader than the task actually requires. The agent does not need the union of the user’s permissions; it needs a specific capability, for a specific task, for a bounded period of time.

Macaroons and biscuit tokens support attenuation: a parent token can be narrowed before being passed to a child, but never broadened. An agent receiving an attenuated capability cannot widen its own scope, even if it locates a more privileged credential on disk.
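Attenuation is easiest to see in a macaroon-style HMAC chain. The sketch below uses only the Python standard library and is not the API of any real macaroon or biscuit library. The token layout, the key=value caveat format, and the function names are assumptions chosen to keep the example short.

```python
import hashlib
import hmac


def _chain(key: bytes, data: str) -> bytes:
    return hmac.new(key, data.encode(), hashlib.sha256).digest()


def mint(root_key: bytes, identifier: str) -> dict:
    # Only the issuer holds root_key; holders see just the chained signature.
    return {"id": identifier, "caveats": [], "sig": _chain(root_key, identifier)}


def attenuate(token: dict, caveat: str) -> dict:
    # Any holder can add a caveat. Removing one would require the previous
    # signature in the chain, which the holder of the narrowed token lacks.
    return {
        "id": token["id"],
        "caveats": token["caveats"] + [caveat],
        "sig": _chain(token["sig"], caveat),
    }


def verify(root_key: bytes, token: dict, request: dict) -> bool:
    sig = _chain(root_key, token["id"])
    for caveat in token["caveats"]:
        sig = _chain(sig, caveat)
    if not hmac.compare_digest(sig, token["sig"]):
        return False
    # Each caveat is "key=value"; every one must match the request.
    pairs = (c.split("=", 1) for c in token["caveats"])
    return all(request.get(k) == v for k, v in pairs)


root = b"issuer-secret"  # hypothetical key for the demo
parent = mint(root, "user:jer.crane")
child = attenuate(attenuate(parent, "action=read"), "resource=staging/volume/xyz")
print(verify(root, child, {"action": "read", "resource": "staging/volume/xyz"}))
print(verify(root, child, {"action": "delete", "resource": "staging/volume/xyz"}))
```

The agent would receive only the child token. It can narrow that token further for a sub-task, but it has no way to recover the parent from it.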

This matters beyond catastrophic deletion scenarios. An agent operating entirely within its legitimate authority surface can still cause major business damage: modifying infrastructure incorrectly, issuing destructive customer actions, or propagating faulty changes at machine speed. Identity containment reduces blast radius. It does not eliminate execution risk.

Agent Attestation

Workload identity solves “who is calling.” It does not solve “what is calling.” For agents, that distinction matters. The verifier needs to know which model, which system prompt, which tool manifest, and which invocation context. The answer is Sigstore-style signed attestations of the agent configuration, verified at the policy enforcement point before any privileged action proceeds. The same patterns the security industry has been applying to container images under frameworks like SLSA apply directly here.

The important distinction is that this is configuration attestation, not behavioral attestation. Verifying the model version and prompt does not guarantee what the agent will do. It establishes what configuration produced the action, which is what attribution requires.
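A configuration attestation reduces to two checks: the signature over a digest is genuine, and the digest matches the configuration actually presented. The sketch below uses an HMAC as a stand-in for Sigstore's asymmetric signing and transparency log, so it shows the shape of the check rather than Sigstore's API. The config fields and function names are assumptions for illustration.

```python
import hashlib
import hmac
import json


def config_digest(config: dict) -> str:
    # Canonical JSON so the same configuration always hashes the same way.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()


def sign_attestation(signing_key: bytes, config: dict) -> dict:
    digest = config_digest(config)
    # HMAC is a stand-in here; a real system would use an asymmetric signature.
    signature = hmac.new(signing_key, digest.encode(), hashlib.sha256).hexdigest()
    return {"digest": digest, "signature": signature}


def verify_attestation(signing_key: bytes, attestation: dict, presented: dict) -> bool:
    expected = hmac.new(
        signing_key, attestation["digest"].encode(), hashlib.sha256
    ).hexdigest()
    signature_ok = hmac.compare_digest(expected, attestation["signature"])
    return signature_ok and attestation["digest"] == config_digest(presented)


key = b"attestation-key"  # hypothetical demo key
config = {
    "model": "example-model-v1",  # hypothetical values throughout
    "system_prompt_sha256": hashlib.sha256(b"You are a coding agent.").hexdigest(),
    "tools": ["read_file", "run_tests"],
}
attestation = sign_attestation(key, config)
print(verify_attestation(key, attestation, config))
print(verify_attestation(key, attestation, dict(config, tools=["read_file", "delete_volume"])))
```

Note what passes and what does not: a changed tool manifest fails verification, but nothing here says anything about what the attested configuration will choose to do.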

Policy Enforcement At The Authorization Layer, Not The Application Layer

Across agentic systems, operational safeguards frequently exist only as instructions to the model rather than enforceable constraints at the authorization boundary. The Railway dashboard had a “type DELETE to confirm” check. The API did not. The PocketOS project rules said NEVER GUESS. The agent ignored them. Instructions to the agent are not policy. Policy is what the system enforces when the agent does not honor the instruction.

Every “the agent will follow these rules” assumption is an educated wish. The rules belong at the gateway, not in the prompt. That means Open Policy Agent, Cedar, or any other policy engine that can sit in front of the resource and evaluate the request before it lands. The agent’s identity, the delegation chain, the capability scope, and the action being attempted are all inputs to that policy evaluation. Once policy depends on model compliance rather than infrastructural enforcement, the control has already failed architecturally.
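As a rough illustration of a gateway-side check, the sketch below evaluates a request in plain Python with a deny-by-default posture. It is not OPA's Rego or Cedar's policy language. The Request fields, the actor_kind values, the DESTRUCTIVE set, and the rule that agents cannot perform destructive actions on their own are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Request:
    principal: str
    actor: Optional[str]      # named agent run, or None for a direct human call
    actor_kind: str           # "human", "automation", or "agent"
    action: str
    resource: str
    capabilities: frozenset   # entries shaped like "action:resource"


# Hypothetical list of actions the platform treats as destructive.
DESTRUCTIVE = {"volumeDelete", "databaseDrop"}


def evaluate(req: Request) -> Tuple[bool, str]:
    # Deny agent traffic that does not name its actor (no delegation chain).
    if req.actor_kind == "agent" and not req.actor:
        return False, "agent request without a named actor"
    # Deny anything outside the capability scope, whoever is calling.
    if f"{req.action}:{req.resource}" not in req.capabilities:
        return False, "action not covered by capability scope"
    # Enforce the confirmation rule at the gateway, not in the prompt.
    if req.actor_kind == "agent" and req.action in DESTRUCTIVE:
        return False, "destructive action requires human confirmation"
    return True, "allowed"


caps = frozenset({"volumeDelete:staging/volume/xyz"})
agent_req = Request("jer.crane@pocketos.com", "agent/cursor/run/abc123",
                    "agent", "volumeDelete", "staging/volume/xyz", caps)
print(evaluate(agent_req))
```

The same request from a human principal with the same capability would pass, which is exactly the distinction the gateway needs to be able to draw.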

Short-Lived Credentials And Revocation That Works

Long-lived API tokens are now actively dangerous, not because they are easier to leak, though they are, but because they are the primary mechanism by which untethered agents acquire privilege. SPIFFE SVIDs are typically issued with TTLs of an hour or less. Capability tokens can be scoped to seconds or minutes. Token exchange happens at the moment of action, not at the start of the day.

Operational Reality

The obvious objection is operational cost. Short-lived credentials, delegated capability issuance, attestation systems, and policy enforcement layers all introduce latency, platform complexity, and operational burden. Most organizations will not deploy full SPIFFE, OPA, Sigstore, and capability infrastructures overnight.

The transition path is incremental. Replace long-lived credentials first. Separate agent identity from human identity. Introduce explicit delegation semantics. Move critical controls out of prompts and into enforceable policy layers. Reduce inherited authority surfaces over time. The important point is not immediate perfection. It is recognizing that authenticated execution no longer reliably implies accountable human intent.

What This Means For Security Leaders

A note on layering before the audit questions. Runtime containment stacks like NVIDIA’s NemoClaw, which sandbox the agent’s host environment and enforce deny-by-default egress, are necessary but not sufficient. They bound what the agent can reach for inside the box. They do not bound what the agent can do once it authenticates to a system outside the box. The identity problem and the runtime containment problem are different layers of the same stack, and either one alone leaves the other failure mode open. A perfectly sandboxed agent holding a broad, long-lived credential is still a perfectly capable instrument of damage to every system that credential can reach.

Three questions belong on every CTO and CISO review this quarter.

Can every authorization boundary distinguish between a human principal and an autonomous delegated actor? If the answer is no, that is a finding. The Railway CEO’s quote is the test: if the platform’s response to “you (or your agent)” is to treat them identically, the platform cannot enforce any policy that depends on the difference.

Which long-lived credentials would become catastrophic if discovered by an agent? Long-lived credentials are the substrate untethered agents run on. Reducing the blast radius of agentic failure starts with reducing the lifetime and scope of every credential in the environment that an agent could plausibly reach.
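One way to start answering that question is a simple inventory pass. The sketch below assumes a credential inventory already exists with lifetime, scope, and reachability recorded. The Credential fields, the one-hour threshold, the destructive scope names, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Credential:
    name: str
    lifetime: timedelta
    scopes: frozenset
    agent_reachable: bool  # e.g. found in a dev shell, CI variable, or local file


def risky(creds, max_lifetime=timedelta(hours=1),
          destructive=frozenset({"delete", "admin"})):
    """Flag long-lived, agent-reachable credentials, worst first."""
    flagged = [c for c in creds if c.agent_reachable and c.lifetime > max_lifetime]
    # Destructive scopes sort first, then longest lifetime.
    return sorted(
        flagged,
        key=lambda c: (not (c.scopes & destructive), -c.lifetime.total_seconds()),
    )


inventory = [
    Credential("ci-deploy-token", timedelta(days=365), frozenset({"deploy"}), True),
    Credential("platform-cli-token", timedelta(days=90), frozenset({"delete", "deploy"}), True),
    Credential("agent-run-svid", timedelta(minutes=30), frozenset({"delete"}), True),
    Credential("offline-root-key", timedelta(days=3650), frozenset({"admin"}), False),
]
for cred in risky(inventory):
    print(cred.name, cred.lifetime.days, sorted(cred.scopes))
```

The hard part in practice is the agent_reachable column, since it depends on knowing where agents actually run and what they can read there.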

Which critical controls depend on prompt compliance rather than enforceable policy? If the answer is “we tell the agent not to,” that is a wish, educated or otherwise. The PocketOS agent confessed in writing that it had ignored its rules. The confession was a curiosity. The next instantiation of the same system, bounded by the same statistics and entropy, will not confess. It will simply act.

The IAM stack was built around accountable humans. Agentic systems are neither human nor accountable. The liability defaults to the person in the chain, because the person is the only one who can be held to it.

Sean O’Hara

Technology leader and Founder of Arbor Engineering Group. He writes about infrastructure, engineering organizations, and the decisions that compound quietly before they surface. Find him at CTO Insights on LinkedIn or on GitHub.