Rezha Julio

Compound Engineering

Most codebases get worse over time. You add a feature, and next month that feature makes the next one harder to build. After a few years, the team spends more time fighting the system than building on it. Everyone has lived through this.

I’ve been thinking about a different approach lately. I’m calling it compound engineering: the idea that every unit of work should make the next unit easier. Bug fixes should eliminate entire categories of future bugs. Patterns you discover should become tools you reuse. The codebase should get easier to work with over time, not harder.

The loop

The whole thing runs on a four-step loop:

Plan → Work → Review → Compound → Repeat
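The loop can be sketched in code. This is an illustrative skeleton, not a real framework: each stage method is a placeholder you would swap for your own tooling, and the `Cycle` class and its names are mine, not from any library.

```python
from dataclasses import dataclass, field


@dataclass
class Cycle:
    # Knowledge carried forward between iterations -- the whole point.
    lessons: list[str] = field(default_factory=list)

    def run(self, task: str) -> None:
        plan = self.plan(task)           # Plan: most of the thinking
        result = self.work(plan)         # Work: the agent implements
        findings = self.review(result)   # Review: catch issues, spot patterns
        self.compound(findings)          # Compound: write the lessons down

    def plan(self, task: str) -> str:
        # Fold in lessons from earlier cycles so each plan starts smarter.
        return f"plan for {task}, informed by {len(self.lessons)} prior lessons"

    def work(self, plan: str) -> str:
        return f"implementation of: {plan}"

    def review(self, result: str) -> list[str]:
        return [f"lesson learned while reviewing: {result}"]

    def compound(self, findings: list[str]) -> None:
        # The step that makes the next cycle easier.
        self.lessons.extend(findings)


cycle = Cycle()
cycle.run("add rate limiting")
cycle.run("fix login bug")  # this plan already sees the first cycle's lesson
```

The structural point is that `lessons` outlives any single iteration; skip `compound` and every cycle starts from zero.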

The first three steps are familiar to anyone who writes software. You figure out what to build, you build it, you check if it’s correct. Nothing new there.

The fourth step is where the gains pile up. Skip it, and you’ve done regular engineering with AI tools. Do it, and you start accumulating knowledge that pays dividends.

Plan (where most of the thinking happens)

Planning is the step people underestimate the most. I’ve found that planning and review should take up roughly 80% of your time. Work and compounding fill the remaining 20%.

That sounds extreme until you realize that a good plan means the AI agent can implement without much supervision. A bad plan means you’re babysitting every line of code. The plan is now the most important artifact you produce, more important than the code itself.

What does planning look like? You understand the requirement, you research how similar things work in the codebase, you check the framework docs, you design the approach, and you poke holes in your own plan before handing it off.

Work (the agent writes code)

Once the plan exists, execution is relatively mechanical. The agent implements step by step. You run tests and linting after each change. You track what’s done and what’s left.
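The "run checks after each change" habit is easy to mechanize. Here is a minimal sketch; the `pytest` and `ruff` commands in the comment are examples, and `run_checks` is a name I made up — substitute whatever test runner and linter your project actually uses.

```python
import subprocess
import sys


def run_checks(commands: list[list[str]]) -> bool:
    """Run each command in order; return True only if all exit cleanly."""
    for cmd in commands:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode != 0:
            print(f"{' '.join(cmd)} failed:\n{proc.stdout}{proc.stderr}",
                  file=sys.stderr)
            return False
    return True


# After each agent change, gate on the full check suite, e.g.:
# run_checks([["pytest", "-q"], ["ruff", "check", "."]])
```

Wiring this into the loop is what makes unsupervised execution safe: the agent's change either passes the gate or gets sent back.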

If you trust the plan, there’s no need to watch every line of code. This is where most developers get stuck. They’ve been trained to review everything line by line, and letting go feels irresponsible. But if the plan is solid and you have tests, the risk is low.

Review (catch problems, capture lessons)

Review catches issues before they ship. But more importantly, review is where you notice patterns. What went wrong? What category of bug was this? Could the system have caught it automatically?

One approach I’ve been experimenting with is running multiple specialized review agents in parallel, each looking at a different angle: security, performance, data integrity, architecture. Everything gets combined into a prioritized list. P1 issues get fixed immediately, P2 issues should be fixed, P3 items are nice-to-haves.
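The fan-out-and-merge shape of that review step looks roughly like this. The reviewer functions below are toy stand-ins with hard-coded heuristics; in practice each would prompt a separate agent with its own focus.

```python
from concurrent.futures import ThreadPoolExecutor


def security_review(diff: str) -> list[tuple[int, str]]:
    # Findings are (priority, message); P1 means fix immediately.
    return [(1, "secret committed in config")] if "API_KEY" in diff else []


def performance_review(diff: str) -> list[tuple[int, str]]:
    return [(2, "possible N+1 query in loop")] if "for row in" in diff else []


def architecture_review(diff: str) -> list[tuple[int, str]]:
    return [(3, "consider extracting a service layer")]


def review(diff: str) -> list[tuple[int, str]]:
    reviewers = [security_review, performance_review, architecture_review]
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda r: r(diff), reviewers))
    findings = [f for batch in results for f in batch]
    return sorted(findings)  # lowest number first: P1, then P2, then P3


findings = review("API_KEY = 'abc'\nfor row in rows: fetch(row)")
```

Each reviewer only has to be good at one thing; the merge-and-sort at the end is what produces the single prioritized list.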

You don’t need a fleet of review agents to do this. The principle works at any scale. After any piece of work, ask: what did I learn here that I could write down?

Compound (the step nobody does)

This is the step that separates compound engineering from “engineering with AI tools.” After you finish a piece of work, you ask:

  • What worked?
  • What didn’t?
  • What’s the reusable insight?

Then you write it down somewhere the system can find it next time. In my setup, this means updating AGENTS.md (the file the agent reads at the start of every session) with new patterns, creating specialized agents when warranted, and storing solved problems as searchable documentation.

Future sessions should find past solutions automatically. If you figured out how authentication works, you document it once, and nobody has to ask you again. The knowledge belongs to the system, not to any individual.
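The compound step itself can be a one-liner habit. A minimal sketch, assuming the AGENTS.md convention described above; the entry format and the `record_lesson` name are just one possibility, not a standard.

```python
from datetime import date
from pathlib import Path


def record_lesson(insight: str, repo_root: Path = Path(".")) -> None:
    """Append a dated lesson to the file the agent reads at session start."""
    agents_md = repo_root / "AGENTS.md"
    entry = f"\n## Lesson ({date.today().isoformat()})\n{insight}\n"
    with agents_md.open("a", encoding="utf-8") as f:
        f.write(entry)


record_lesson("Auth tokens are minted in middleware/session.py; "
              "never construct them by hand in views.")
```

The payoff comes later: because the file is read at the start of every session, the lesson is in context before the next plan is even drafted.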

The adoption ladder

Not everyone jumps straight to fully autonomous agents. I think about it in six stages:

Stage 0: You write everything by hand. This built great software for decades, but it’s slow by 2025 standards.

Stage 1: You use ChatGPT or Claude as a search engine with better answers. Copy-paste what’s useful. You’re still in full control.

Stage 2: You let AI tools make changes directly in your codebase, but you review every line. Most developers plateau here.

Stage 3: You create a detailed plan, let the agent implement it without supervision, and review the pull request. This is where compound engineering starts to work.

Stage 4: You describe what you want, and the agent handles everything from research to PR creation. You review and merge.

Stage 5: You run multiple agents in parallel on cloud infrastructure, reviewing PRs as they come in. You’re directing a fleet.

Beliefs worth reconsidering

A few assumptions that I think get in the way:

“The code must be written by hand.” The actual requirement is correct, maintainable code that solves the right problem. Who typed it doesn’t matter.

“First attempts should be good.” In my experience, first attempts have a 95% garbage rate. Second attempts are still 50%. This isn’t failure, it’s iteration. The goal is to get to attempt three faster than you used to get to attempt one.

“Code is self-expression.” This one stings, but letting go of attachment to code means you take feedback better, refactor without flinching, and skip arguments about style.

“More typing equals more learning.” The developer who reviews ten AI implementations understands more patterns than the one who hand-typed two. Understanding matters more than muscle memory.

The 50/50 rule

Traditional teams spend 90% of their time on features and 10% on everything else. I think the right split is closer to 50/50: half your time building features, half improving the system.

An hour building a review agent saves ten hours of review over the next year. A test generator saves weeks of manual test writing. System improvements make work faster. Feature work doesn’t compound in the same way.

Wrapping up

You don’t need a fancy multi-agent review system to benefit from this thinking. The core idea is dead simple: after you finish any piece of work, spend a few minutes writing down what you learned in a place where your future self (or your AI tools) can find it.

Plan carefully, let the tools do the typing, review for substance, and write down what you learned. Each cycle gets a little faster than the last.

The hard part isn’t the process. The hard part is the emotional adjustment. Letting go of line-by-line review. Trusting the plan. Accepting that code you didn’t type is still your responsibility. That takes time, and it’s okay to work through it at your own pace.

