<?xml version="1.0" encoding="UTF-8"?>
<rss version='2.0' xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Dominic Cooney</title>
    <description>Software engineer. This blog does not represent my employer.</description>
    <link>https://blog.dpc.dev/</link>
    <atom:link href="https://blog.dpc.dev/feed" rel="self" type="application/rss+xml"/>
    <category domain="blog.dpc.dev">Content Management/Blog</category>
    <language>en-us</language>
    <pubDate>Tue, 23 Sep 2025 18:53:57 +0900</pubDate>
    <managingEditor>dominic.cooney@gmail.com (Dominic Cooney)</managingEditor>
      <item>
        <guid>http://blog.dpc.dev/my-agentic-coding-process-summer-2025#60961</guid>
        <pubDate>Tue, 23 Sep 2025 18:53:57 +0900</pubDate>
        <link>http://blog.dpc.dev/my-agentic-coding-process-summer-2025</link>
        <title>My Agentic Coding Process, Summer 2025</title>
        <description></description>
        <content:encoded><![CDATA[<p>Around June 2025 I had time on my hands, so I tackled what I viewed as the next set of important problems with Agentic coding. In my view, the problems were:</p>

<ul>
<li>To <strong>make good use of my time,</strong> by having the agent run and only involve me when there&#39;s a problem.</li>
<li>To <strong>elicit good (...good enough) performance from models,</strong> without continuous monitoring.</li>
<li>To <strong>avoid &quot;collapse.&quot;</strong></li>
</ul>

<h1 id="make-good-use-of-my-time">Make Good Use of My Time</h1>

<p>Once a model is a much better engineer than you, sure, you should let it churn out source code and trust it to judge whether that code is good. In 2025, models got better and more marginal engineers embraced the tools, so it follows that people increasingly handed the reins to coding agents. I&#39;m an AI positivist prepared to do that <em>at some point.</em> But for strong engineers, it doesn&#39;t make sense to delegate everything to agents <em>yet.</em></p>

<p>If the human is going to be involved, the question is: how do you use their time effectively?</p>

<p>For me, circa northern hemisphere summer 2025, that means getting out of the <em>inner loop</em> without getting out of the loop entirely: working on the parts of the problem that complement the model&#39;s capabilities, then being free to walk away instead of waiting for the model to churn.</p>

<h1 id="elicit-good-enough-performance-from-models">Elicit Good (Enough) Performance from Models</h1>

<p>Models are steadily improving. The context systems around them are fitfully improving. But we can&#39;t throw arbitrary problems at these systems and expect good results. So what are we going to do while we wait for models to improve?</p>

<p>Classifying errors, then understanding and attacking their root causes, sounds dangerously like real work. Work which may be eclipsed by model improvements anyway.</p>

<p>Instead, we should work on this key problem: <strong>How do you pick tasks that are &quot;just right&quot; for your chosen model&#39;s capacity for attention and reasoning?</strong> By the way, you can err on both sides. If you pick a task that&#39;s too hard, performance is bad, and it is easy to see that. But if you pick a task that&#39;s <em>too easy,</em> model performance is good, but you have subtly wasted human time dealing with a relatively trivial task.</p>

<h1 id="avoid-collapse">Avoid &quot;Collapse&quot;</h1>

<p>By &quot;collapse&quot;, I mean these specific failure modes:</p>

<ul>
<li>The agent f*cks up the codebase. Progress stops.</li>
<li>The agent doesn&#39;t explosively f*ck things up, but it goes in circles (break it, put it back, break it differently... often subtly degrading quality as it does so), or it gets distracted by other problems.</li>
<li>The agent exudes a can-do attitude and claims problems are solved, but they are not.</li>
</ul>

<h1 id="my-meta-coding-system">My Meta-Coding System</h1>

<p>Here&#39;s how I tackled these problems. I&#39;ll start with the <em>what...</em> a set of prompts and a tiny bit of Python driving the Claude Code SDK in the inner loop. The prompts explain three roles:</p>

<ol>
<li><em>Operator.</em> Defines goals. Played by the human.</li>
<li><em>Engineer.</em> Realizes goals. Played by the agent.</li>
<li><em>Compliance.</em> Checks this process is being adhered to. A separate invocation of the agent.</li>
</ol>

<p>The <strong>operator</strong> authors and checks in a file, <strong>GOALS.md.</strong> This is a bulleted list of requirements, in a tree structure. Requirements are named. They cover content ranging from something suitable for a product requirements document (PRD) that your product manager might produce, to technical requirements you might find in a design doc.</p>

<p>Here&#39;s an example of ONE goal related to an interactive visualization of a path planning system for an autonomous ship:</p>

<blockquote>
<p>G-STAGE There is a 3D viewport which can be rotated using the mouse or touch and zoomed with pinch or the scrollwheel. The z=0 plane is suggested by subtle grid lines.</p>

<p>The view is centered on our ship in the simulation. Specifically, the camera is behind and above the ship, and is relative to the ship. This means as the ship moves and changes direction on its path, the camera makes compensatory pans and swings.</p>

<p>If the user reorients the camera while the simulation is playing, the simulation does not stop and there is no jank when they stop manipulating the view.</p>
</blockquote>
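<p>To make the shape concrete: GOALS.md is just markdown that the agents read directly, but a bullet tree of named requirements can also be sliced mechanically. Here is a minimal sketch of such a slicer. The <code>G-*</code> naming pattern and two-space indents are my assumptions for illustration, not a documented format:</p>

```python
import re

# Hypothetical sketch: assumes GOALS.md is a markdown bullet tree whose
# items begin with a requirement name like "G-STAGE". The agents read the
# file directly; this parser exists only to illustrate the structure.
GOAL_LINE = re.compile(r"^(\s*)- (G-[A-Z0-9-]+)\s+(.*)$")

def parse_goals(text):
    """Return {goal_id: (depth, summary)} from a GOALS.md-style bullet tree.

    Depth is derived from two-space indentation, so sub-goals created by
    decomposition nest under their parent requirement.
    """
    goals = {}
    for line in text.splitlines():
        m = GOAL_LINE.match(line)
        if m:
            indent, name, summary = m.groups()
            goals[name] = (len(indent) // 2, summary)
    return goals
```

Because requirements are named, a decomposed sub-goal like <code>G-STAGE-CAMERA</code> can sit one level below <code>G-STAGE</code> without losing its identity across attempts.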

<p>The <strong>engineer</strong> examines the codebase and the goals and picks a goal to pursue. This happens in a structured way:</p>

<ol>
<li>At the start of a goal attempt, create a git branch named after the goal.</li>
<li>Solve the goal adhering to an engineering philosophy explained in the engineer&#39;s prompt. This philosophy is, roughly, the Google way of software engineering (at least in the late 2000s and early 2010s when I worked there.) That should be another blog post, but I&#39;ll briefly characterize it as objective, technical, critical, preferring cleanliness and consistency, and heavily emphasizing tests. There are some additions, though. For example, an added emphasis on making results visible to <strong>engineer</strong> and <strong>operator.</strong> This encourages practical, performance-enhancing tool use like looking at screenshots and operator-ergonomic improvements like storybooks and demo modes.</li>
<li>Using objective measures of success--typically a claim that a unit test adequately captures a requirement--<strong>compliance</strong> scrutinizes whether a goal has been achieved. It literally reads the commit message to discover what goal is being attempted, reads the relevant goal, reads the relevant test to decide if it covers the relevant goal, and can run tests and examine output. It makes a judgement based on these facts.</li>
</ol>
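<p>Step 1 above is mechanical enough to sketch. Assuming one git branch per attempt, cut from a fixed departure point (the branch-naming convention here is mine, not necessarily the author&#39;s):</p>

```python
import subprocess

def attempt_branch(goal_id, attempt_no):
    """Deterministic branch name for the Nth attempt at a named goal."""
    return f"{goal_id.lower()}-attempt-{attempt_no}"

def start_attempt(goal_id, attempt_no, base="main"):
    """Cut a fresh branch off the departure point for a new goal attempt.

    Failed attempts are never deleted; they remain behind as dead-end
    branches that can be inspected when improving the meta-process.
    """
    branch = attempt_branch(goal_id, attempt_no)
    subprocess.run(["git", "checkout", "-b", branch, base], check=True)
    return branch
```

Cutting every attempt from the same departure point is what makes a failed attempt cheap to abandon: the next attempt starts clean rather than on top of half-broken work.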

<p>When a goal <em>is</em> achieved, the engineer continues by selecting a new goal.</p>

<p>When a goal is <em>not</em> achieved, the engineer makes a new attempt at the goal. However, critically, they do this by creating a new git branch that resets them to the departure point. (The failed work remains in git as an evolutionary dead-end. Often interesting to inspect when improving the meta-process.)</p>

<p>After a failed attempt, <strong>engineer</strong> may edit <strong>GOALS.md</strong> to sub-divide a goal into simpler parts. Given the context of an attempt that failed and the judgement of the compliance agent about why the effort failed, these goal decompositions turn out to be feasible.</p>

<p>If all goals are achieved, the project is done. Alternatively, after three successive failed attempts, the process halts and escalates to the operator.</p>
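<p>Putting the loop together, the halt-and-escalate rule might look like this sketch, with the engineer and compliance invocations abstracted as callables. <code>attempt_goal</code> and <code>judge</code> are my placeholder names standing in for the real agent invocations, not actual SDK calls:</p>

```python
def run_to_completion(goals, attempt_goal, judge, max_failures=3):
    """Drive the engineer/compliance loop until every goal passes, or halt
    after max_failures consecutive failed attempts and escalate."""
    pending = list(goals)
    failures = 0
    while pending:
        goal = pending[0]
        work = attempt_goal(goal)    # engineer: cut a branch, attempt the goal
        if judge(goal, work):        # compliance: objective check of the claim
            pending.pop(0)
            failures = 0             # the failure streak resets on any success
        else:
            failures += 1            # the dead-end branch stays in git
            if failures >= max_failures:
                return ("escalate", goal)
    return ("done", None)
```

Note that the counter tracks <em>consecutive</em> failures across the whole run, so the agent cannot dodge the limit by hopping to a different goal after each failure.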

<h1 id="the-good-parts">The Good Parts</h1>

<p>I find this system effective and a better use of my time than the coding agents I left behind. There&#39;s no more constant hand-holding. Write some goals, leave it alone, and come back much later to be delighted at how much progress it has made.</p>

<p>It is easy to incorporate feedback by writing new goals, or changing goals.</p>

<p>Interestingly, there&#39;s no checklist of completed goals. This is a deliberate choice. It avoids the Pollyanna problem where the agent does some clowny sh*t, adds a green tick emoji to the task list, and stops thinking about that goal. Claude is surprisingly attentive to when goals are out of alignment with the code and will deal with updated requirements where goals have been refined.</p>

<p>The best goals are orthogonal, or at least not openly contradictory, and the software architecture is stronger as a result. By revising goals continuously the goals document is a living document and not a historical task list. (If you want history, that&#39;s what git log is for.)</p>

<p>The process of resetting branches, and subdividing goals, is roughly adapted from <a href="https://mikadomethod.info/">The Mikado Method.</a> This slays the problem of the agent always wanting to just fix one more broken thing... oh you&#39;re right, let me just change that... oh I need to just fix... and going in circles.</p>

<p>Similarly, three failed attempts in a row stop the agent from burning tokens if it is really stuck. It prevents endlessly hopping between problems because <em>any</em> three <em>consecutive</em> failures stop the agent. And it prevents the agent wasting time on your poorly specified goals. (If the sub-sub-goal is too hard, you need to ask for help.)</p>

<h1 id="next-steps">Next Steps</h1>

<p>For focus, I deliberately left some opportunities for later. I&#39;ve been thinking about them, though.</p>

<p>One is parallelism. My system is single threaded. There&#39;s nothing inherently single-threaded about this process, but exploiting parallelism <em>well</em> will benefit from effectively merging related subgoals, and possibly flagging conflicting subgoals, arising from the concurrent agents. Interesting topic but not one I needed to solve this summer. Maybe when it is colder outside.</p>

<p>Another is learning more effectively from mistakes. I generated this process <em>using</em> this process, but the approach was hacky. Doing it &quot;live&quot; confused the process-improving agent, which tried to incorporate some learnings from first-order projects which were too domain specific. Waiting until projects were done to reflect and make improvements to the process went better, but Claude&#39;s overzealousness about preventing <em>any</em> errors led it to compromise the inherent flexibility of the feedback loop from failures.</p>

<p>(It is prosaic, but I believe simply collecting <em>all</em> of the tweaks over the course of several projects, and using them for classic fine tuning, could be useful to try. The hard part there is just collecting the data.)</p>

<p>This system burns a lot of tokens (...which was kinda the point. Unblock machine time from the human attention bottleneck.) There&#39;s plenty of easy efficiency work like caching the process documentation. Using cheaper models is appealing: High capacity models can do goal decompositions for cheaper models to attempt. (Unblock cheap inference from the expensive inference bottleneck.)</p>

<p>Ultimately, I found this a useful point in the solution space. One I believe avoids a lot of problems I hear other senior engineers are having. At the same time, there&#39;s a lot of juice left to be squeezed out of these models. Let&#39;s keep sharing what we find that works (and doesn&#39;t work) when using AI to invent new ways of building software. Peace and love to you all, my fellow hackers.</p>
]]></content:encoded>
      </item>
      <item>
        <guid>http://blog.dpc.dev/chocolate-chip-cookies#51264</guid>
        <pubDate>Sun, 03 Jan 2021 09:55:57 +0900</pubDate>
        <link>http://blog.dpc.dev/chocolate-chip-cookies</link>
        <title>Chocolate Chip Cookies</title>
        <description></description>
        <content:encoded><![CDATA[<p>This is based on the &ldquo;Pittsburgh&rdquo; cookie in <a href="https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46507.pdf"><em>Bayesian Optimization for a Better Dessert</em></a> with even less sugar and other tweaks. That is an industrial recipe, so this version works well if you dump everything in a bowl and mix. It is a good recipe to make with kids.</p>

<p>Skeptical about the chilli? Divide your dough and make some cookies with and some without. The chilli makes the cookies compulsively more-ish.</p>

<p><em>Ingredients</em></p>

<ul>
<li>136g butter, melted</li>
<li>107g flour</li>
<li>60g oat flour, <em>which you can make by milling oatmeal in your blender into a powder</em></li>
<li>55g muscovado sugar, <em>or brown sugar... or any kind of sugar</em></li>
<li>½ tsp salt, <em>even medium coarse salt works well</em></li>
<li>½ tsp baking soda</li>
<li>50g egg, <em>if you have more than 50g, use all of the yolk first</em></li>
<li>⅓ tsp vanilla essence</li>
<li>⅓ tsp orange essence</li>
<li>two dashes of cayenne pepper, <em>or</em> ½ tsp chilli powder 🌶️</li>
<li>254g chocolate chips</li>
</ul>

<p><em>Method</em></p>

<ol>
<li>Preheat the oven to 170&deg;C. Line two baking trays with baking paper.</li>
<li>In a mixing bowl, add all ingredients except the chocolate chips. Mix well.</li>
<li>Add the chocolate chips and mix briefly to distribute them.</li>
<li>Using a tablespoon measure, scoop out crude balls and place on baking trays.</li>
<li>Bake until light brown, about 20 minutes.</li>
<li>Remove from the oven and cool for 5 minutes, then transfer to a cooling rack.</li>
</ol>

<p><em>Tips</em></p>

<ul>
<li>🚫 Don&#39;t introduce the egg to hot melted butter: The butter will cook the egg and you will get a mixture of cookie dough and scrambled egg. I soften butter in a 500W microwave oven for one minute, which leaves it not very hot, add the flour, then the egg.</li>
<li>🚫 Don&#39;t mix the butter and sugar until the sugar is dissolved. Without some sugar crystals, the surface texture of the cookie suffers. And we&#39;re already using a fine sugar.</li>
<li>🚫 Don&#39;t roll the cookies into smooth balls. This makes the cookie surface smooth and less crunchy. Before baking the cookies should look very rough and forlorn.</li>
</ul>
]]></content:encoded>
      </item>
      <item>
        <guid>http://blog.dpc.dev/useless-stuff#50031</guid>
        <pubDate>Sat, 20 Jun 2020 23:27:12 +0900</pubDate>
        <link>http://blog.dpc.dev/useless-stuff</link>
        <title>Useless Stuff</title>
        <description>How to have less of it</description>
        <content:encoded><![CDATA[<p>You have too much useless stuff.</p>

<p>Get rid of it.</p>

<p>Marie Kondo&#39;s method can manipulate your emotions to help you throw things out. First, you have to pile all your clothes on the bed. Put all your books on the floor. That sort of thing. How much stuff you have will <em>disgust</em> you. Second, you have to &quot;wake up&quot; your things and &quot;thank&quot; the ones which do not &quot;spark joy&quot; before you &quot;let them go.&quot; This is sleight of hand to distract you from what you&#39;re doing: You are wasting these things. Marie approves. You have her permission.</p>

<p>Japan has an indigenous concept, 断捨離 (danshari), which has three parts: Don&#39;t get things you don&#39;t need. Discard things you have but don&#39;t use. Don&#39;t obsess about things. This is more comprehensive than the Kon-Mari method. The Kon-Mari method is <em>discarding.</em> Discarding has the negative connotation of wastefulness, もったいない (mottainai.) So Marie Kondo is less famous in Japan. People I&#39;ve talked to here seem skeptical of her method.</p>

<p><strong>Subscriptions.</strong> Cancel them. This is my work in progress. The newspaper, some software are still too hard to let go of. But try. You will cancel some. You will get offers to remain a subscriber at half price for others. (Read those cancellation terms.) Patreon&mdash;those people you support have moved on to other things. Or <em>you</em> have moved on to other things. That&#39;s OK.</p>

<h2 id="dont-get-any-more-stuff">Don&#39;t Get Any More Stuff</h2>

<p>Once you have less stuff, it&#39;s tempting to get new stuff. Ask yourself these questions first:<sup id="fnref1"><a href="#fn1" rel="footnote">1</a></sup></p>

<ol>
<li>Can I live without this item?</li>
<li>Based on my financial situation, can I afford it?</li>
<li>Will I actually use it?</li>
<li>Do I have space for it?</li>
<li>How did I come across it in the first place?</li>
<li>What is my emotional state in general today?</li>
<li>How do I feel about buying it? And how long will this feeling last?</li>
</ol>

<p>Examine your motivations. You don&#39;t need, can&#39;t afford, won&#39;t use, and don&#39;t have space for [thing] you stumbled upon while feeling a little bored. The nice feeling you have about buying it won&#39;t last.</p>

<p><strong>Kickstarter.</strong> Getting off it takes special effort. You click, you buy, you get the little endorphin rush from that. The remorse comes much, much later&mdash;if at all. From time to time something does arrive in the post. I forgot all about this! It&#39;s a present from someone with excellent taste: Yours. Let&#39;s shop some more!</p>

<p>I tried to give it up cold turkey. That didn&#39;t work. Things I had already backed kept pulling me to the site. Size? Color? Address? Psst you might like this project. Only now that I&#39;m free of it can I see what&#39;s going on. <em>Of course</em> Kickstarter <em>could</em> remember my mailing address. They don&#39;t. That is the point.</p>

<p>What did work was doing Kickstarter &quot;lite.&quot; Project looks interesting? Great, back it for $1. You will get updates but no rewards. This makes the site less sticky. It kills the reward signal. In a month I was clean; I didn&#39;t even need to back things for $1 any more.</p>

<p><img alt="sb_float" class="sb_float" src="https://silvrback.s3.amazonaws.com/uploads/532acd33-cb65-4348-b265-993d80bb72e3/IMG_20200621_083553_medium.jpg" /> <strong>Keep the box.</strong>  If you do cave and buy something, keep the box. These fill up your closet space and act as a natural brake on having too much. And when you&#39;re ready to &quot;let it go,&quot; it&#39;s better to sell it in the original box.</p>

<h2 id="minimalism-even-less-stuff">Minimalism: Even Less Stuff</h2>

<p>To go deeper, you need the minimalists. <a href="https://www.amazon.com/gp/product/0141986387/ref=as_li_tl?ie=UTF8&tag=dpc00-22&camp=247&creative=1211&linkCode=as2&creativeASIN=0141986387&linkId=d1323a5b2d1896c6c304614ed633d87c"><em>Goodbye, Things: On Minimalist Living</em></a> has a series of short arguments for minimalism. Read it. A few of them will stick and start to nag at you.</p>

<p>Minimalism is a thing I have <em>not</em> acquired, though.</p>

<div class="footnotes">
<hr>
<ol>

<li id="fn1">
<p>I clipped these from an article, but threw the rest of the article out. If you know the source, please let me know so I can link to it.&nbsp;<a href="#fnref1" rev="footnote">&#8617;</a></p>
</li>

</ol>
</div>
]]></content:encoded>
      </item>
  </channel>
</rss>