This post connects directly to the framing established in the Confidence Engineering series. If you haven’t read it, the trust vs. confidence distinction matters here.


I’ve watched humans fail RBAC. Not hypothetically. Repeatedly, across organizations that had the policies, the tooling, the documented procedures. Permissions creep. Shared credentials. Service accounts with more access than anyone remembered granting. The controls failed constantly, and we tolerated it because somewhere at the end of every thread was a person. Accountable. Reachable. Someone who could be walked into a conference room and asked to explain themselves.

That person was never in the architecture diagram. But they were always in the room.

Every Kubernetes governance model ever built rests on that invisible assumption. Humans carry identity and accountability together. Workloads carry identity and constrained, deterministic purpose. The entire framework, RBAC, audit trails, GitOps, change control, was designed for those two actor categories and nobody else. Not because the architects were careless. Because those were the only actors that existed.

That’s no longer true.


Autonomous agents operating inside your cluster carry valid identity. They execute legitimate API calls. They appear in your audit logs exactly the way a governed actor should. And when something traces back to that agent, when a regulator asks who authorized a specific data access, when legal counsel needs an accountable owner, there is no one at the end of the thread unless your governance framework explicitly required one before the agent was ever provisioned.

Most frameworks don’t. None of the default tooling enforces it.

This is not a governance gap in the traditional sense. A governance gap implies the framework is mostly right and needs extension. What’s actually broken is the foundational assumption underneath the framework. Kubernetes governance was never really about RBAC or audit logs. Those were mechanisms. The actual load-bearing structure was human accountability. Remove it and the mechanisms are still running, still logging, still showing green, and the thing they were built to enforce no longer exists.

The controls didn’t fail. The thing holding them up did.


Agents inherit properties from all three actor categories Kubernetes knows how to govern. They carry identity like humans. They have purpose like workloads. They hold scoped API access like service accounts. And they fully satisfy none of them, because the property that made each category governable, accountability for humans, determinism for workloads, bounded scope for service accounts, is precisely what agents don’t have.

What agents have instead is probabilistic reasoning. They don’t execute what their manifest says. They reason about what to do based on context, inputs, and objectives that can change at runtime. That’s not a variation on constrained workload behavior. It’s a different category of actor entirely, one that your governance framework has no vocabulary for because it was never anticipated.

I’ve written about this framing problem in the Confidence Engineering series. We anthropomorphized the trust model. We gave agents human-shaped identity because that’s the only shape our governance frameworks know, then acted surprised when a system making probabilistic decisions at machine speed didn’t behave like a person making deliberate ones. Math is not judgment. Handing human-shaped authority to a probabilistic system doesn’t make the system accountable. It makes the accountability invisible.

And invisible accountability is indistinguishable from no accountability at all.


Every control Kubernetes has assumes intent precedes action. RBAC assumes a human or constrained workload with defined, predictable purpose. Audit trails assume decisions exist before execution so there’s something to record. GitOps assumes the repository reflects reality because the actors that change state do so through the pipeline. Change control assumes the separation between deciding and doing is where governance lives.

Agents collapse that separation entirely. They generate intent at runtime, based on context that didn’t exist until the moment of execution. There’s no decision to audit before the action because the decision and the action are the same event. Your controls aren’t bypassed. They’re structurally inapplicable to an actor whose reasoning is ephemeral by design.

I’ve watched humans break every one of these controls. Permissions creep. Audit gaps. GitOps drift nobody noticed until something downstream failed. We tolerated all of it because the human was still there, reachable, accountable, slow enough that you had time to catch it.

Agents remove the accountability and keep the failure. Then they run it at machine speed.


Addressing this requires two controls your governance framework currently has no vocabulary for.

The first is what I’d call Authority Class. Your cluster currently asks: what can this identity access? The right question for agents is: what level of autonomous decision-making is this identity authorized to exercise? An agent that surfaces recommendations to an operator carries different governance requirements than an agent that executes kubectl commands autonomously, which carries different requirements again from an agent that modifies its own operational parameters based on observed outcomes. Same API permissions. Completely different risk profiles. Without Authority Class as a distinct governance dimension, your framework grants access without bounding the decision-making authority that access enables. Those are not the same thing.

The second is Drift Containment. Individual agents are manageable. Agent ecosystems are not, without explicit boundaries on what happens when agents invoke tools that trigger other agents, which feed outputs back into the originating agent’s context. That’s not a hypothetical failure mode. It’s a predictable outcome of autonomous systems operating on shared infrastructure without circuit breakers. Drift containment defines the boundaries, the triggers that require human review before those boundaries expand, and the mechanisms that terminate runaway chains before they compound.

Neither of these requires waiting for tooling to mature. They’re policy decisions. You can define Authority Class for every agent operating in your environment right now. You can establish Drift Containment boundaries before the first incident exposes the gap. The organizations that navigate this well won’t be the ones with the best tooling. They’ll be the ones who made the policy decisions before the tooling conversation forced the issue.


Your cluster knows who acted.

It has no way to know who is accountable.

Everything was built assuming humans made the decisions. The assumption was never documented because it never needed to be. It needs to be now, before the first agent incident makes it visible in the worst possible way.

The question isn’t whether agents will operate inside your Kubernetes environments. They already do in organizations near you, governed by frameworks never designed to evaluate autonomous decision-making. The question is whether you define authority and accountability before the failure, or explain the gap after it.


Photo by Growtika on Unsplash