AI Governing AI: Turtles All The Way Down

Leo Cullen
Mar 1
Updated: Mar 2

Many suggest that, given the speed, scale, and complexity of AI-initiated action, only more AI can effectively govern future systems. The argument sounds pragmatic: if machines act at machine speed, only machines can supervise at machine speed.

And yet this position should give us pause, and some concern. Once we say that AI must govern AI, we must ask a simple question: where does authority terminate?

If the answer is “in another model”, and then another, and another, we are in danger of building governance on an infinite regress. Turtles all the way down.

The Seductive Logic of Recursive Governance

The case for AI governing AI is not irrational.

Modern systems:

Execute transactions in milliseconds
Optimise portfolios continuously
Trigger communications autonomously
Adjust infrastructure in real time

Human review at that speed is impossible. So, the proposal emerges: deploy supervisory AI to monitor operational AI. Then deploy meta-monitoring AI to supervise that layer.

Technically elegant. Conceptually unstable.

Because if every layer is adaptive, evolving, and probabilistic, then governance becomes another statistical process. Statistical processes can optimise within boundaries, but they cannot define the legitimacy of those boundaries.

Optimisation Is Not Governance

AI is extraordinarily good at:

Exploring policy trade-offs
Simulating risk scenarios
Stress-testing constraints
Detecting anomalies
Identifying policy drift

But governance is not optimisation.

Governance answers different questions:

What level of harm is tolerable?
Who bears liability?
What trade-offs are morally or politically acceptable?
Who has standing to authorise action?

These decisions require legitimacy, not just computational coherence.

An AI can suggest the most efficient constraint architecture, but it cannot legitimately decide which harms a society or institution is willing to bear.

If it does, we have quietly transferred political authority into a system that should not hold it.

Where the Regress Begins

The regress occurs when:

1. A model evaluates another model’s outputs

2. A supervisory model determines whether risk thresholds are exceeded

3. A meta-model recalibrates those thresholds dynamically

4. The system updates itself without explicit re-authorisation

Each layer appears to strengthen governance. In reality, authority drifts upward into increasingly abstract mechanisms.

If no layer terminates in a named human institution with binding authority and liability, governance has become recursive pattern management rather than accountable oversight.
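The termination test can be made concrete. What follows is a minimal sketch, not a reference implementation: the `Overseer` record, the `kind` labels, and the example layer names are all hypothetical, chosen only to illustrate walking a supervision chain upward and asking whether it ever reaches a named human institution.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Overseer:
    """One layer in a supervision chain (hypothetical structure)."""
    name: str
    kind: str  # "model" or "human_institution"
    supervised_by: Optional[str] = None  # name of the next layer up, if any


def terminating_authority(start: str, registry: dict[str, Overseer]) -> Optional[str]:
    """Follow the supervision chain upward from `start`.

    Returns the name of the first human institution reached, or None if the
    chain ends in a model or loops back on itself.
    """
    seen: set[str] = set()
    current = registry.get(start)
    while current is not None and current.name not in seen:
        if current.kind == "human_institution":
            return current.name
        seen.add(current.name)
        current = registry.get(current.supervised_by) if current.supervised_by else None
    return None


# Models supervising models in a loop: no accountable endpoint.
turtles = {
    "trading_model": Overseer("trading_model", "model", "risk_monitor"),
    "risk_monitor": Overseer("risk_monitor", "model", "threshold_tuner"),
    "threshold_tuner": Overseer("threshold_tuner", "model", "risk_monitor"),
}

# The same stack, with the top layer answering to a named committee.
anchored = dict(turtles)
anchored["threshold_tuner"] = Overseer("threshold_tuner", "model", "risk_committee")
anchored["risk_committee"] = Overseer("risk_committee", "human_institution")

print(terminating_authority("trading_model", turtles))   # None
print(terminating_authority("trading_model", anchored))  # risk_committee
```

A check like this says nothing about whether the named institution actually holds binding authority and liability. It only shows whether the diagram has an endpoint at all.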

Probability supervising probability. Turtles.

The Clean Termination Point

A stable architecture requires a clear separation: AI may assist in designing and operating constraint systems. Humans must authorise and remain accountable for those constraints. Enforcement must be technically binding and externally verifiable.

This means:

Constraints are formally expressed and versioned
Execution is gated by runtime enforcement mechanisms the model cannot override
Changes to mandate require explicit approval
Audit trails capture who authorised what, when, and why

In this structure:

AI proposes. Humans authorise. Systems enforce.

The chain terminates in institutional accountability, not recursive inference.
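As a rough illustration of that separation, here is a minimal sketch under stated assumptions. The `Constraint`, `Authorisation`, and `Gate` names, the `max_order_value` limit, and the approver shown are all hypothetical. In a real deployment the gate would sit in a separate service the model has no write access to; a Python class cannot provide that isolation on its own.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Constraint:
    """A formally expressed, versioned constraint (hypothetical example)."""
    version: int
    max_order_value: float

    def digest(self) -> str:
        payload = json.dumps(
            {"version": self.version, "max_order_value": self.max_order_value},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()


@dataclass(frozen=True)
class Authorisation:
    """Who authorised what, when, and why."""
    constraint_digest: str
    approver: str
    reason: str
    approved_at: str


def authorise(constraint: Constraint, approver: str, reason: str) -> Authorisation:
    """Stand-in for the human approval step; binds approval to exact content."""
    return Authorisation(
        constraint_digest=constraint.digest(),
        approver=approver,
        reason=reason,
        approved_at=datetime.now(timezone.utc).isoformat(),
    )


class Gate:
    """Runtime enforcement: only an authorised constraint can become active."""

    def __init__(self) -> None:
        self._active: Optional[Constraint] = None
        self.audit_log: list[Authorisation] = []

    def activate(self, proposed: Constraint, auth: Authorisation) -> None:
        if auth.constraint_digest != proposed.digest():
            raise PermissionError("authorisation does not match the proposed constraint")
        if self._active is not None and proposed.version <= self._active.version:
            raise ValueError("constraint version must increase")
        self._active = proposed
        self.audit_log.append(auth)

    def permit(self, order_value: float) -> bool:
        # Default deny: nothing executes until a constraint has been authorised.
        if self._active is None:
            return False
        return order_value <= self._active.max_order_value


proposal = Constraint(version=1, max_order_value=10_000.0)       # AI proposes
approval = authorise(proposal, "Risk Committee", "Q1 mandate")   # humans authorise
gate = Gate()
gate.activate(proposal, approval)                                # systems enforce
print(gate.permit(5_000.0), gate.permit(50_000.0))               # True False
```

Binding the approval to a digest of the constraint means a model cannot widen its own limit and reuse an earlier sign-off: any change to the mandate needs a fresh authorisation, which lands in the audit trail.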

Bounded Autonomy, Not Recursive Autonomy

The right objective is not “AI governing AI”.

It is bounded autonomy.

Autonomy within:

Defined mandate scope
Explicit risk thresholds
Escalation triggers
Revocation capability
Named responsibility

Supervisory AI can monitor, flag, and escalate. But it must not be the final arbiter of its own authority.

Otherwise, when something goes wrong, there is no accountable endpoint - only system logs and layered models.

Governance without a terminating authority is not governance. It is orchestration.
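The five elements of bounded autonomy map onto a small data structure. This is a sketch under assumptions rather than a prescribed design: the `Mandate` fields, the numeric risk scores, and the `decide` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"
    DENY = "deny"


@dataclass
class Mandate:
    allowed_actions: frozenset[str]  # defined mandate scope
    risk_limit: float                # explicit risk threshold (hard ceiling)
    escalation_limit: float          # escalation trigger, set below risk_limit
    responsible_owner: str           # named responsibility
    revoked: bool = False            # revocation capability

    def revoke(self) -> None:
        """Withdraw the mandate; every later request is denied."""
        self.revoked = True


def decide(mandate: Mandate, action: str, risk_score: float) -> Decision:
    """Classify a requested action against the mandate."""
    if mandate.revoked or action not in mandate.allowed_actions:
        return Decision.DENY
    if risk_score > mandate.risk_limit:
        return Decision.DENY
    if risk_score > mandate.escalation_limit:
        # Inside the mandate but above the comfort zone: hand to the owner.
        return Decision.ESCALATE
    return Decision.ALLOW


mandate = Mandate(
    allowed_actions=frozenset({"rebalance"}),
    risk_limit=0.8,
    escalation_limit=0.5,
    responsible_owner="Head of Risk",
)
print(decide(mandate, "rebalance", 0.2))      # Decision.ALLOW
print(decide(mandate, "rebalance", 0.6))      # Decision.ESCALATE
print(decide(mandate, "wire_transfer", 0.1))  # Decision.DENY
```

A supervisory model could reasonably supply the `risk_score`. What it should not be able to do is edit the `Mandate` itself or call `revoke` on its own behalf; those belong to the named owner.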

Governance v Stacked Turtles?

The debate should not be framed as: Can AI govern AI?

The better question is: Can we design governance architectures where AI operates at speed, but authority remains anchored in explicit, accountable mandate?

Speed and scale are technical challenges. Legitimacy is a governance challenge.

Only the first can be solved with more computation. The second requires clarity about where authority ends and accountability resides.

Without that clarity, the system may function brilliantly - until the moment it fails. And when it fails, we will discover whether we built governance or simply stacked turtles.
