A closed loop is a workflow that feeds itself: the output of one run becomes the input of the next, with no human in between. Link several of these together and point them at a goal, and you have a flywheel  –  a system that generates, measures its own results, and decides what to try next without waiting for you. Coding agents already work this way. AI research labs are building their entire operation this way. This episode is about what happens when the flywheel starts spinning inside ordinary organizations, how it affects humans  –  and why the infrastructure to absorb it does not exist yet.

If you need an unbiased view on your transition to becoming AI-native, you can schedule a 1-on-1 consultation with Will here. Will Schenk is a co-founder of TheFocus.AI, where he works directly with companies navigating these transitions.

What's in today's episode:

  • The ladder: pipeline, workflow, flywheel

  • You are already running flywheels

  • The labs are closing the biggest loop of all

  • Why this reaches you on a schedule you don't control

  • The review bottleneck in closed-loop AI workflows

  • Three kinds of infrastructure that don't exist yet

  • The flywheel runs both ways

  • Which loops to close first

  • The new divide: machine-speed verification

The ladder: pipeline, workflow, flywheel

In Episode #5 we drew a line between two things. A pipeline is a fixed sequence of steps  –  a cron job, a script, plumbing. Whatever branching it has is logical and mechanical; it does nothing based on what we'd call human judgment. A workflow is a repeating sequence of decisions and actions with points along the way where a human exercises judgment. Strip the judgment out and a workflow collapses back into a pipeline. We ended on the observation that as a workflow matures, the human migrates from the middle to the edges. They set the parameters at the start and review the exceptions at the end. The middle belongs to the agent.

Imagine now that your organization has many workflows, each one having absorbed a slice of human judgment. What's the next level? What happens when you link them together, point them at a goal, and let them run?

That linked, goal-seeking system is a flywheel. It is a collection of workflows wired so the output of one becomes the input of the next, turning continuously toward an objective you defined once. A closed loop is the smallest flywheel  –  one workflow feeding itself. Link several and the flywheel gets bigger, but the machine is the same, so we'll use loop and flywheel interchangeably from here.

But what separates a flywheel from a pipeline? A pipeline repeats; a flywheel steers. And the steering wheel is measurement. The system acts, measures the result of its own action, and uses that measurement to decide the next action. Also, a flywheel has three beats, not one: generate, measure, decide what to try next  –  then generate again.
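The three beats fit in a few lines of Python. What follows is a toy sketch, not anyone's production system: the names `State`, `generate`, `measure`, `decide`, and `run_flywheel` are all hypothetical, and the "goal" is simply a number the loop tries to home in on. The point is the shape: each measurement changes what the next run tries.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class State:
    best: float        # best candidate found so far
    best_score: float  # the measured score of that candidate
    step: float        # how far, and in which direction, to move next


def generate(state: State) -> float:
    """Beat 1: propose the next candidate from the current state."""
    return state.best + state.step


def measure(candidate: float) -> float:
    """Beat 2: score the candidate objectively (toy goal: get close to 7)."""
    return -((candidate - 7.0) ** 2)


def decide(state: State, candidate: float, score: float) -> State:
    """Beat 3: use the measurement to choose what to try next."""
    if score > state.best_score:
        # Improvement: adopt the candidate and keep moving the same way.
        return State(candidate, score, state.step)
    # No improvement: keep the old best, reverse direction, take a smaller step.
    return State(state.best, state.best_score, -state.step / 2)


def run_flywheel(state: State, iterations: int) -> Tuple[State, List[Tuple[float, float]]]:
    """Generate, measure, decide, then generate again, with no human in between."""
    history = []
    for _ in range(iterations):
        candidate = generate(state)
        score = measure(candidate)
        state = decide(state, candidate, score)
        history.append((candidate, score))
    return state, history


start = State(best=0.0, best_score=measure(0.0), step=4.0)
final, history = run_flywheel(start, iterations=10)
print(final.best, final.best_score)
```

Replace `decide` with a function that always returns the same step and the loop still repeats, but it no longer steers. That is the pipeline.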

There are two ways to close that loop: one right and one wrong.

The first is to remove the human checkpoint and hope. This is how most "we deployed autonomous agents" stories begin, and how most of the embarrassing ones end. This is the wrong way.

The second is to replace the human checkpoint with a verifier  –  something that can tell a good output from a bad one without a person reading it. A test suite. A schema validation. A reconciliation against known totals. A performance metric. The human judgment does not disappear; it gets encoded once, into the verifier, instead of being exercised by hand on every run.
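As an illustration, here is a minimal sketch of a verifier that combines two of those checks: a schema validation and a reconciliation against a known total. Everything in it is an assumption made for the example: the field names, the `verify_batch` and `route` functions, and the idea that the batch is a list of invoice-like rows. The judgment ("every row needs these fields, and the amounts must add up") is written down once and then applied to every run.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical schema: the fields each row must carry, and their types.
REQUIRED_FIELDS = {"sku": str, "quantity": int, "amount_cents": int}


@dataclass
class Verdict:
    ok: bool
    reasons: List[str] = field(default_factory=list)


def verify_batch(rows: List[Dict[str, Any]], expected_total_cents: int) -> Verdict:
    """Return a verdict on an agent-produced batch without a person reading it."""
    reasons: List[str] = []

    # Check 1: schema validation on every row.
    for i, row in enumerate(rows):
        for name, kind in REQUIRED_FIELDS.items():
            if not isinstance(row.get(name), kind):
                reasons.append(f"row {i}: '{name}' missing or not {kind.__name__}")

    # Check 2: reconciliation, only meaningful once the schema holds.
    if not reasons:
        total = sum(row["amount_cents"] for row in rows)
        if total != expected_total_cents:
            reasons.append(
                f"total {total} does not reconcile with {expected_total_cents}"
            )

    return Verdict(ok=not reasons, reasons=reasons)


def route(rows: List[Dict[str, Any]], expected_total_cents: int) -> str:
    """Close the loop on passing batches; send only the failures to a person."""
    verdict = verify_batch(rows, expected_total_cents)
    return "auto-accept" if verdict.ok else "escalate-to-human"
```

Note what the human still owns here: the schema, the source of the expected total, and the escalation path. They stop reading every batch and start maintaining the check.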

That distinction is the whole episode. Loops do not close because someone decides to trust the model. They close where verification has been made cheap, fast, and objective. Everywhere else, the human stays.

And notice where the human goes. A workflow moved them from the middle to the edges. A flywheel moves them up a level again  –  off the work, off the coordination between workflows, and onto the verifier itself. The judgment stays with humans.

You are already running flywheels

If this sounds futuristic, look at how software gets written this year. A modern coding agent does not just write code  –  it runs an experiment: write code, run the tests, read the failures, rewrite, run the tests again. Nobody reviews iteration three of seven. The human reviews the final diff, and increasingly, for low-stakes changes, not even that. Anthropic says the majority of its own code is now written by Claude Code. OpenAI reported in February that GPT-5.3-Codex was instrumental in building itself  –  debugging its own training runs and analyzing its own evaluation results. Generate, measure, decide, repeat. That is the shape.

And it is not a coding-only shape. Picture an ad-optimization flywheel. One workflow generates the creative  –  headline, copy, image. A second pulls performance from the ad console  –  impressions, click-through, conversions. A third reads that performance and decides the next experiment: kill the loser, scale the winner, try a new angle. Wire the three together and you have a flywheel that runs marketing experiments around the clock, with no human between iterations. The reason it can run is the same reason coding could: the ad console is a verifier. Performance is measured, not vibed. The measurement closes the loop.
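The third workflow, the one that decides, might look something like the sketch below. It is a hedged toy, not a real ad platform integration: `VariantStats`, `decide_next`, and the `min_impressions` threshold are hypothetical names and values, and the numbers would come from whatever ad console you actually use. Because performance data is noisy, the sketch refuses to judge a variant until it has enough impressions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class VariantStats:
    name: str
    impressions: int
    clicks: int

    @property
    def ctr(self) -> float:
        """Click-through rate; zero when the variant has not been shown yet."""
        return self.clicks / self.impressions if self.impressions else 0.0


def decide_next(stats: List[VariantStats], min_impressions: int = 1000) -> Dict[str, str]:
    """Map each variant to an action: 'kill', 'scale', or 'keep'."""
    actions = {s.name: "keep" for s in stats}

    # Only variants with enough data are eligible to be judged.
    mature = [s for s in stats if s.impressions >= min_impressions]
    if len(mature) >= 2:
        ranked = sorted(mature, key=lambda s: s.ctr)
        loser, winner = ranked[0], ranked[-1]
        if loser.ctr < winner.ctr:
            actions[loser.name] = "kill"
            actions[winner.name] = "scale"

    return actions


stats = [
    VariantStats("headline_a", impressions=2000, clicks=40),
    VariantStats("headline_b", impressions=2000, clicks=100),
    VariantStats("headline_c", impressions=100, clicks=50),
]
print(decide_next(stats))
```

The young variant with the eye-catching early rate is left alone. A threshold like this is one simple way to keep noise from steering the flywheel; a real system would likely want a proper statistical test.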

Why did coding close first? Because software spent forty years building the verification infrastructure that closed loops require. Compilers reject malformed programs. Type systems catch whole categories of mistakes. Test suites encode "what good looks like" in executable form. CI runs all of it automatically on every change. When LLMs arrived, the verifier was already sitting there, waiting. Advertising has a weaker version of the same gift  –  performance numbers are objective, if noisy. The strength of the verifier is what decides whether a loop can close at all.

This is the Factory AI principle from Episode #5, now operating at full strength: the ease of training an agent on a task is proportional to how verifiable the task is. Coding was the most verifiable knowledge work on earth, so the loop closed there first.

Now run the logic over your own organization. Which of your workflows has a test suite? Which has anything resembling one? For most companies the honest answer is that the workflows have humans. The human is the verification layer  –  and the coordination layer, the thing deciding which workflow runs next and whether the whole effort is working. Which means the human is the reason the flywheel cannot spin. Remove them, and you reveal a debt nobody scoped.

The labs are closing the biggest loop of all

The most consequential closed loop being built right now is AI research itself.

The progression over the past year has been incredible. The AI Scientist project, reported in Nature in March, automates the research cycle end to end: it generates ideas, runs experiments, writes up results, and reviews its own papers. A startup literally named Recursive published results this week from a system that proposes a research idea, implements it, runs the experiment, validates the result, and uses what it learned to choose the next experiment  –  running many threads over long horizons, with explicit machinery to catch reward hacking before treating a gain as real. Anthropic published a piece this month titled "When AI builds itself," stating plainly that a growing share of its AI development is delegated to AI systems, and that taken far enough, the trend points toward systems that design their own successors. Jack Clark has put a number on it: roughly 60% probability of a system that can train a more powerful successor without human involvement by the end of 2028. Dean Ball's February article "On Recursive Self-Improvement" argues that frontier labs are automating large fractions of their research operations, and that their effective workforces of agents will grow from thousands toward hundreds of thousands within a year or two.

Well, let's not get crazy! Labs have incentives to describe their own momentum in the strongest possible terms. And they might get there earlier than others. But the organizational argument still holds. You do not really need superintelligence for the closed loop to work. You only need what is already happening: work loops closing in domains where verification is strong, plus one observation about how capability travels.

Why this reaches you on a schedule you don't control


Previous in the series: AI Workflow Patterns: The Real Unit of AI Adoption in 2026

Next in the series: We will discuss verification in detail.

FAQ

What is a closed-loop AI workflow? A workflow whose output feeds its own next run with no human review in between. The system acts, measures its own result, decides what to try next, and acts again.

What is an AI flywheel? Several closed-loop workflows linked toward a single goal  –  one generates, one measures, one decides the next move  –  so the system runs experiments continuously and steers itself. A closed loop is the smallest flywheel; linking more workflows just makes it bigger.

How is a flywheel different from automation? A pipeline (automation) repeats fixed steps. A flywheel steers: the result of each run changes the next one. Closing the loop safely means replacing the human checkpoint with an automated verifier, not just removing it.

Why did coding agents close the loop first? Software already had decades of verification infrastructure  –  compilers, type systems, test suites, CI pipelines  –  so "what good looks like" was encoded in executable form before LLMs arrived. The strength of the verifier decides whether a loop can close.

Which workflows should close the loop first? The ones whose verifier is stronger than their failure mode: sync-and-transform, triage, and monitoring patterns. Taste-based work like external-facing drafts should close last, if ever.

What is the main risk of a flywheel? Compounding error and reward hacking. Because each run feeds the next, small mistakes propagate instead of staying local, and the system can optimize a measured number while missing the intent behind it. Defenses include regression evals, canary checks, drift detection, and a human who reviews the verifier instead of every output.
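One of those defenses, the canary check, is small enough to sketch. The idea is to test the verifier itself: feed it cases whose correct verdict is already known and halt the loop if it gets any of them wrong. The code below is an illustrative toy under stated assumptions; `canary_check`, `strict_verifier`, `drifted_verifier`, and the sample cases are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def canary_check(
    verifier: Callable[[str], bool],
    known_good: Sequence[str],
    known_bad: Sequence[str],
) -> List[Tuple[str, str]]:
    """Return every case where the verifier disagrees with the known answer."""
    problems = []
    for case in known_good:
        if not verifier(case):
            problems.append(("rejected known-good", case))
    for case in known_bad:
        if verifier(case):
            problems.append(("accepted known-bad", case))
    return problems


def strict_verifier(summary: str) -> bool:
    """Hypothetical verifier: a summary must be non-empty and at most 20 words."""
    return 0 < len(summary.split()) <= 20


def drifted_verifier(summary: str) -> bool:
    """A verifier that has silently stopped checking anything."""
    return True


KNOWN_GOOD = ["Revenue rose 4% on higher volume."]
KNOWN_BAD = ["", "word " * 50]

for verifier in (strict_verifier, drifted_verifier):
    problems = canary_check(verifier, KNOWN_GOOD, KNOWN_BAD)
    status = "ok to run" if not problems else "halt the loop"
    print(verifier.__name__, status, len(problems))
```

A verifier that accepts everything looks healthy from inside the loop, because every run passes. The canary cases are what expose it, and curating them is part of the human's job of reviewing the verifier.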
