
AI Governing AI: Turtles All The Way Down

Updated: Mar 2

Many suggest that, given the speed, scale, and complexity of AI-initiated action, only further AI can effectively govern future systems. The argument sounds pragmatic: if machines act at machine speed, only machines can supervise at machine speed.


And yet this position gives pause for thought (and concern), because once we say that AI must govern AI, we must ask a simple question: where does authority terminate?

If the answer is “in another model”, and then another, and so on, we are in danger of building governance on an infinite regress. Turtles all the way down.



The Seductive Logic of Recursive Governance


The case for AI governing AI is not irrational.


Modern systems:


  • Execute transactions in milliseconds

  • Optimise portfolios continuously

  • Trigger communications autonomously

  • Adjust infrastructure in real time


Human review at that speed is impossible. So, the proposal emerges: deploy supervisory AI to monitor operational AI. Then deploy meta-monitoring AI to supervise that layer.


Technically elegant. Conceptually unstable.


Because if every layer is adaptive, evolving, and probabilistic, then governance becomes another statistical process. Statistical processes can optimise within boundaries, but they cannot define the legitimacy of those boundaries.


Optimisation Is Not Governance


AI is extraordinarily good at:


  • Exploring policy trade-offs

  • Simulating risk scenarios

  • Stress-testing constraints

  • Detecting anomalies

  • Identifying policy drift


But governance is not optimisation.

Governance answers different questions:


  • What level of harm is tolerable?

  • Who bears liability?

  • What trade-offs are morally or politically acceptable?

  • Who has standing to authorise action?


These decisions require legitimacy, not just computational coherence.


An AI can suggest the most efficient constraint architecture, but it cannot legitimately decide which harms a society or institution is willing to bear.


If it does, we have quietly transferred political authority into a system that should not hold it.


Where the Regress Begins


The regress occurs when:


1. A model evaluates another model’s outputs

2. A supervisory model determines whether risk thresholds are exceeded

3. A meta-model recalibrates those thresholds dynamically

4. The system updates itself without explicit re-authorisation


Each layer appears to strengthen governance. In reality, authority drifts upward into increasingly abstract mechanisms.


If no layer terminates in a named human institution with binding authority and liability, governance has become recursive pattern management rather than accountable oversight.


Probability supervising probability. Turtles.
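One way to make the termination test concrete is to treat oversight as a chain of "accountable to" links and ask whether that chain ever reaches a named human institution. The sketch below is illustrative only: the `Overseer` record, the `kind` labels, and the example names (`trader`, `supervisor`, `meta`, `risk_committee`) are all assumptions made for this example, not part of any real framework.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Overseer:
    name: str
    kind: str                             # "model" or "human_institution" (hypothetical labels)
    accountable_to: Optional[str] = None  # name of the next overseer up, if any


def terminating_authority(start: str, registry: dict[str, Overseer]) -> Optional[Overseer]:
    """Walk the accountable_to links from `start`.

    Returns the first human institution reached, or None if the chain
    dead-ends or loops back on itself (models supervising models).
    """
    seen: set[str] = set()
    current = registry.get(start)
    while current is not None and current.name not in seen:
        if current.kind == "human_institution":
            return current
        seen.add(current.name)
        nxt = current.accountable_to
        current = registry.get(nxt) if nxt is not None else None
    return None


# A regress: every layer answers only to another model.
turtles = {
    "trader": Overseer("trader", "model", "supervisor"),
    "supervisor": Overseer("supervisor", "model", "meta"),
    "meta": Overseer("meta", "model", "supervisor"),
}

# The same stack, but the top layer answers to a named body.
anchored = dict(turtles)
anchored["meta"] = Overseer("meta", "model", "risk_committee")
anchored["risk_committee"] = Overseer("risk_committee", "human_institution")

print(terminating_authority("trader", turtles))   # None: no accountable endpoint
print(terminating_authority("trader", anchored))  # the risk_committee record
```

The check is deliberately crude. It says nothing about whether the named body actually has binding authority or liability; it only shows that "where does authority terminate?" can be asked of an architecture mechanically, and that a stack of models alone returns no answer.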


The Clean Termination Point


A stable architecture requires a clear separation: AI may assist in designing and operating constraint systems. Humans must authorise and remain accountable for those constraints. Enforcement must be technically binding and externally verifiable.


This means:


  • Constraints are formally expressed and versioned

  • Execution is gated by runtime enforcement mechanisms the model cannot override

  • Changes to mandate require explicit approval

  • Audit trails capture who authorised what, when, and why


In this structure:

AI proposes. Humans authorise. Systems enforce.


The chain terminates in institutional accountability, not recursive inference.
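A minimal sketch of "AI proposes, humans authorise, systems enforce" might look like the following. Everything here is hypothetical: the `Constraint` and `PolicyGate` names, the single `max_transaction_value` limit, and the approver `risk_committee` are assumptions chosen for illustration. Note also that an in-process Python object is not a genuinely binding control; in practice the gate would need to sit outside anything the model can modify.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Constraint:
    version: int
    max_transaction_value: float  # hypothetical example of a formally expressed limit


class PolicyGate:
    """Holds the active constraint; only named human approvers can change it."""

    def __init__(self, initial: Constraint, approvers: set[str],
                 authorised_by: str, reason: str):
        if authorised_by not in approvers:
            raise PermissionError(f"{authorised_by} cannot authorise the initial mandate")
        self._approvers = frozenset(approvers)
        self._active = initial
        self.audit_log: list[dict] = []
        self._record("authorise", authorised_by, initial, reason)

    def _record(self, event: str, who: str, constraint: Constraint, why: str) -> None:
        # Audit trail: who did what, to which version, when, and why.
        self.audit_log.append({
            "event": event,
            "who": who,
            "version": constraint.version,
            "limit": constraint.max_transaction_value,
            "when": datetime.now(timezone.utc).isoformat(),
            "why": why,
        })

    @property
    def active(self) -> Constraint:
        return self._active

    def propose(self, proposer: str, new_limit: float, why: str) -> Constraint:
        """AI proposes: the proposal is logged, the active constraint is unchanged."""
        candidate = Constraint(self._active.version + 1, new_limit)
        self._record("propose", proposer, candidate, why)
        return candidate

    def authorise(self, candidate: Constraint, approver: str, why: str) -> None:
        """Humans authorise: only a named approver can activate a new version."""
        if approver not in self._approvers:
            self._record("rejected", approver, candidate, "not an authorised approver")
            raise PermissionError(f"{approver} cannot authorise mandate changes")
        self._active = candidate
        self._record("authorise", approver, candidate, why)

    def permits(self, transaction_value: float) -> bool:
        """Systems enforce: execution is checked against the active version only."""
        return transaction_value <= self._active.max_transaction_value


gate = PolicyGate(Constraint(1, 1000.0), {"risk_committee"},
                  authorised_by="risk_committee", reason="initial mandate")
proposal = gate.propose("supervisor_model", 5000.0, "observed headroom in limits")
print(gate.permits(2000.0))  # still checked against version 1 until a human approves
gate.authorise(proposal, "risk_committee", "reviewed and accepted")
print(gate.permits(2000.0))  # now checked against version 2
```

The design choice worth noticing is that `propose` never touches the active constraint. A supervisory model can recommend a new limit as often as it likes, but the only path to a new version runs through a named approver, and every step leaves a record.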


Bounded Autonomy, Not Recursive Autonomy


The right objective is not “AI governing AI”.

It is bounded autonomy.


Autonomy within:

  • Defined mandate scope

  • Explicit risk thresholds

  • Escalation triggers

  • Revocation capability

  • Named responsibility


Supervisory AI can monitor, flag, and escalate. But it must not be the final arbiter of its own authority.


Otherwise, when something goes wrong, there is no accountable endpoint, only system logs and layered models.


Governance without a terminating authority is not governance. It is orchestration.
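The elements of bounded autonomy listed above can be sketched as a single mandate object. This is an illustration under stated assumptions: the `Mandate` and `Decision` names, the risk scores between 0 and 1, and the example owner `head_of_treasury` are all invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"
    DENY = "deny"


@dataclass
class Mandate:
    owner: str                       # named responsibility
    allowed_actions: frozenset[str]  # defined mandate scope
    escalate_above: float            # escalation trigger (hypothetical risk score, 0 to 1)
    deny_above: float                # explicit risk threshold
    revoked: bool = False            # revocation capability

    def revoke(self, by: str) -> None:
        # Only the named owner can withdraw the mandate.
        if by != self.owner:
            raise PermissionError(f"{by} cannot revoke a mandate owned by {self.owner}")
        self.revoked = True

    def decide(self, action: str, risk: float) -> Decision:
        # Out of scope, over the hard threshold, or revoked: refuse outright.
        if self.revoked or action not in self.allowed_actions or risk > self.deny_above:
            return Decision.DENY
        # Inside scope but above the trigger: hand the decision to the owner.
        if risk > self.escalate_above:
            return Decision.ESCALATE
        return Decision.ALLOW


mandate = Mandate(
    owner="head_of_treasury",
    allowed_actions=frozenset({"rebalance", "hedge"}),
    escalate_above=0.4,
    deny_above=0.8,
)
print(mandate.decide("rebalance", 0.2))  # Decision.ALLOW
print(mandate.decide("rebalance", 0.6))  # Decision.ESCALATE
print(mandate.decide("liquidate", 0.1))  # Decision.DENY (outside mandate scope)
```

A supervisory model could reasonably be the thing that computes `risk`. What it cannot do in this sketch is widen `allowed_actions`, move the thresholds, or reverse a revocation, because none of those are exposed to it.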


Governance vs Stacked Turtles?


The debate should not be framed as: Can AI govern AI?


The better question is: Can we design governance architectures where AI operates at speed, but authority remains anchored in explicit, accountable mandate?


Speed and scale are technical challenges. Legitimacy is a governance challenge.

Only the first can be solved with more computation. The second requires clarity about where authority ends and accountability resides.


Without that clarity, the system may function brilliantly until the moment it fails. And when it fails, we will discover whether we built governance or simply stacked turtles.

 

 

 
 
 
