What the Hell Is Loop Engineering?

The Elephant on the Lid

Jun 11, 2026

The best loop engineer in my house is a rice cooker. The Japanese one, with the little elephant on the lid. (And no, this is not a sponsored post.)

I thought about it this week because my feed filled up with the same idea in different outfits: stop prompting your AI agents; design loops that prompt them. The engineer who created Claude Code says he doesn’t prompt Claude anymore. His loops do. Riding on top of it all, a brand-new phrase: loop engineering. The practice it names has been building quietly at the frontier for a year or more; the word just caught up.

Every time a new term shows up wearing a jacket two sizes too big, I try to explain it to myself in plain English. This time I didn’t have to look far. The explanation was sitting on my kitchen counter, wearing an elephant.

You Are the Loop

Start with how most of us use AI today. It looks exactly like cooking rice in a pot.

You put the rice on, and now you’re committed. You hover. You lift the lid. You poke, taste a grain, turn the flame down, set a timer in your head, come back, taste again. The rice gets cooked, sure. But let’s be clear about who did the cooking. The pot held the heat. You were the loop.

That’s prompting. You ask, read what came back, adjust, ask again, and at some point decide it’s good enough to stop. The model is the pot. You are the loop. Which is why “prompt engineering” quietly stopped being the whole job: the prompt was never the expensive part. The hovering was.

You can hear the ratchet click. First we engineered prompts, the words. Then context, what the model gets to see. Now loops, what the model does next, again and again, without us. Each layer exists because the one before it hit a wall.

The Rice Cooker Doesn’t Hover

Now watch the machine do the same job.

With my rice cooker, every decision happens before anything cooks. I pick the menu: white, brown, sushi, porridge. I pick the texture; the elephant offers five, from softer to firmer. Then I press start and walk away. There’s nothing left for me to do. The machine owns the middle. And mine is the pressure type, so I couldn’t hover if I wanted to: once it pressurizes, the lid locks. No peeking. No poking. The pot stays sealed until the machine decides the job is done.

Under that lid, it’s running the same cycle I used to run by hand: do a step, look at what actually happened, decide. Heat the pot; read the temperature; keep going or shut off. For an AI agent, swap the verbs: run the code, read the test results, then re-prompt, declare done, or wake a human.

That’s it. That’s the loop.

A small, boring program that does the hovering for you. It prompts the agent, reads what came back, and decides whether to go again. The people building these aren’t writing better prompts. They’re building rice cookers.
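For readers who think in code, here is a minimal Python sketch of that small, boring program. It is an illustration under assumptions, not anyone's production harness: `run_loop`, `Verdict`, `agent_step`, `check`, and the `max_passes` cap are names invented for this example, and the agent is a stand-in function rather than a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    """What the sensor reports after one pass (hypothetical shape)."""
    done: bool
    feedback: str


def run_loop(
    agent_step: Callable[[str], str],
    check: Callable[[str], Verdict],
    task: str,
    max_passes: int = 5,
) -> tuple[str, int]:
    """Prompt, measure, decide. Returns (outcome, passes_used)."""
    prompt = task
    for n in range(1, max_passes + 1):
        output = agent_step(prompt)   # heat the pot
        verdict = check(output)       # read the temperature
        if verdict.done:
            return "done", n          # declare done
        # Go again, feeding the measurement back into the next prompt.
        prompt = f"{task}\n\nLast attempt failed: {verdict.feedback}"
    return "wake_human", max_passes   # out of passes: escalate


def make_fake_agent(succeed_on: int) -> Callable[[str], str]:
    """Stand-in for a real agent: fails until the given pass number."""
    calls = {"n": 0}

    def step(prompt: str) -> str:
        calls["n"] += 1
        return "tests pass" if calls["n"] >= succeed_on else "tests fail"

    return step


def fake_check(output: str) -> Verdict:
    """Stand-in for a real test run."""
    return Verdict(done=(output == "tests pass"), feedback=output)


print(run_loop(make_fake_agent(3), fake_check, "fix the bug"))  # ('done', 3)
```

Note the two exits. One is the measurement saying done. The other is the pass cap, which is what keeps a loop with a broken sensor from cooking all night.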

And the good ones pressure-cook. A sealed pot raises the boiling point past anything an open pot can reach, which is why pressure-cooked rice comes out more tender. A well-built loop does the same thing to your work. It runs every test on every pass. It rereads the whole change every time. It keeps going at 2 a.m. That’s the part people miss: the loop can do more than replace your hovering. It can out-cook you.

Loop engineering, in one line: designing the thing that watches the pot, and sets the pressure, so you don’t have to.

How Does It Know the Rice Is Done?

This is the question that separates a toy from a tool, and the rice cooker’s answer is my favorite part.

The original rice cookers had no timer. They couldn’t see the rice. They knew one fact of physics: as long as there’s water in the pot, the temperature can’t rise past boiling. The moment the water is gone — absorbed, done — the temperature spikes. The cooker feels the spike and shuts off. It doesn’t guess. It measures.

AI loops live or die on the same question. Making a loop go is easy; making it stop at the truth is the craft. The teams who’ve been at this the longest use two mechanisms, neither exotic. First, they write down what “done” means before anything runs. Anthropic’s engineers gave one long-running agent a checklist of 200-plus requirements, every one marked failing on day one; the loop’s job was to turn them green. Done is a list, written before the pot goes on. Second, the cook doesn’t grade the rice. A separate, smaller model checks after every pass and answers one question: is it actually done? The heating element heats. The sensor decides. Never the same part.

(If this sounds like test-driven development wearing a new jacket: yes. The difference is who’s reading the tests now.)
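A rough sketch of "done is a list" in Python, under stated assumptions. The requirement names, the `workspace` dictionary, and the functions `grade` and `is_done` are all hypothetical. The sketch also swaps in plain deterministic checks where the teams described above use a separate model as the grader; the point it keeps is that the grading code is separate from whatever did the work.

```python
from typing import Callable

# Each requirement is a name plus a check the machine can run.
# These three are made-up examples, not a real project's list.
Checklist = dict[str, Callable[[dict], bool]]

checklist: Checklist = {
    "unit tests pass": lambda ws: ws.get("tests_failed") == 0,
    "no lint errors": lambda ws: ws.get("lint_errors") == 0,
    "changelog updated": lambda ws: ws.get("changelog_updated") is True,
}


def grade(checklist: Checklist, workspace: dict) -> dict[str, bool]:
    """The sensor: runs every check on every pass, never the cook."""
    return {name: bool(check(workspace)) for name, check in checklist.items()}


def is_done(results: dict[str, bool]) -> bool:
    """Done means every item is green. An empty list is never done."""
    return bool(results) and all(results.values())


day_one: dict = {}  # nothing built yet, so every item starts failing
later = {"tests_failed": 0, "lint_errors": 0, "changelog_updated": True}

print(is_done(grade(checklist, day_one)))  # False
print(is_done(grade(checklist, later)))    # True
```

The empty-list rule is deliberate: if nobody wrote the requirements down, the loop should refuse to call anything finished.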

But rice comes with a boiling point. Code doesn’t. So the whole craft of loop engineering is building one: a signal so unambiguous the loop can’t talk itself past it. The physics won’t save you. You have to write the physics.

If you live in the AI world, this should look familiar: it’s how the models themselves get trained now. The labs call it reinforcement learning with verifiable rewards (RLVR). Instead of a person or another model scoring the output on feel, the reward comes from a check that can’t be argued with. The answer matches or it doesn’t. The tests pass or they don’t. AI researcher Jason Wei wrote it down last year as the “verifier’s law”: the easier a task is to verify, the faster AI gets good at it. Rich Sutton said the quiet part back in 2001, in an essay called “Verification, the Key to AI”: an AI system can create and maintain knowledge only to the extent that it can verify that knowledge itself. Same principle, two altitudes. The labs use the unarguable check as the reward that trains the model. Your loop uses it as the latch that unlocks the lid.


You Already Run This Loop

If this still feels exotic, look at your own calendar.

You already run a loop. You’ve run it for years. It’s called a sprint. The goal you write at planning is the stop condition. The acceptance criteria are the checklist, red until they’re green. Velocity and burndown are the temperature readout. The review is the taste test. The retro is you adjusting the flame. Then the loop goes again.

Loop engineering is your sprint cycle with the clock speed turned up, from two weeks to two minutes, with a machine running the ceremony. Which is why I’d bet on the leaders in the room, not just the engineers: you’ve spent a career learning what a good goal, an honest metric, and a real retro look like. That’s the exact skill the loop needs. This is closer than you think, and simpler than the jacket makes it look.

Undercooked, Overcooked, Forgotten

Simpler doesn’t mean safe. Remember the trade you just made: the lid is locked, and nobody’s watching mid-run. Whatever hovering used to catch, design has to catch now. So before you trust one, it’s worth knowing how loops fail. They fail in kitchen-shaped ways.

Undercooked: a cooker that stops at twelve minutes stops at twelve minutes, ready or not. Crunchy rice. Agents do this. Anthropic documented one that came back mid-project, saw progress, and declared the whole job finished. It wasn’t.

Overcooked: no stop signal at all, and the loop keeps going past done. Confidently. At machine speed. While you sleep. Good work cooks down into mush, and nobody can uncook mush.

Forgotten: long loops lose the plot, because the model’s short-term memory fills up mid-run. The fix is humble: the loop keeps notes outside itself, a progress file, a commit log, the way you’d leave a note on the counter. Rice on, started 6:40. They put an elephant on the lid for a reason. The loop, unlike the elephant, forgets.
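The note on the counter can be as plain as a file the loop appends to after each pass and rereads when it starts up. A minimal sketch in Python, assuming a JSON-lines progress file; the file name `progress.jsonl`, the note fields, and the helpers `leave_note` and `read_notes` are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def leave_note(path: Path, note: dict) -> None:
    """Append one note per pass, so the record survives a fresh start."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(note) + "\n")


def read_notes(path: Path) -> list[dict]:
    """Reread the counter before doing anything. No file means no history."""
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    progress = Path(tmp) / "progress.jsonl"
    leave_note(progress, {"pass": 1, "status": "rice on", "started": "6:40"})
    leave_note(progress, {"pass": 2, "status": "tests still failing"})
    notes = read_notes(progress)
    print(len(notes), notes[-1]["status"])  # 2 tests still failing
```

Appending instead of overwriting is the design choice that matters here: a pass that crashes halfway cannot erase what earlier passes wrote down.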

And let’s say the quiet parts out loud. Loops eat tokens; a loop that runs all night bills like it ran all night, and every pass rereads the kitchen. Most teams aren’t ready, and not because the tech is exotic: if nobody can write down what “done” means, the loop has nothing to check. And a loop amplifies whatever you already are. DORA’s 2025 research found that AI magnifies an organization’s existing strengths and weaknesses. A loop with weak verification just ships your dysfunction faster.

If those complaints sound familiar, they should. Continuous integration got the same reception when it showed up: too expensive, not ready, why would anyone run every test on every change. It was also the first loop we ever trusted with our code, every commit ending in a red or green verdict with no feelings attached. Nobody would build without it now. The agent loop is the same idea with a bigger appetite, and the teams that learn to write the physics early are the ones it will pay off for.

Why I’m Telling You This

I don’t write these loops, outside a couple of pet projects that exist mostly for fun. My job is to support the people who will. Most of my team isn’t writing them yet either, and the reason is simple: the prerequisite isn’t code. It’s the rice-cooker question, and it’s harder than it sounds: can you write down what “done” means, in a form a machine can check? I made this argument in The Player in Street Clothes: the bottleneck isn’t generating the code anymore. It’s knowing whether the code should ship.

One honest footnote. “Loop engineering” might not even be the name that sticks; some camps call the same thing harness engineering. The word may not survive the year. The rice cooker will.

That’s loop engineering, minus the jacket. Stop standing over the pot. Build the thing that watches it. Set the pressure higher than you could hold by hand. And spend your real effort on the only question that matters: what’s the temperature spike that proves it’s done?

So here is my question for you this week. Think about the last thing your team shipped with AI in the loop. Who decided it was done — a measurement, or a feeling?

#Leadership #AI #LoopEngineering #AIEngineering #EngineeringLeadership #FutureOfWork

My Biased Read
