Discipline Is the Point
The prevailing narrative goes something like this: agents are getting better, fast. Soon they'll plan, execute, test, and deploy. The engineer becomes an observer — maybe a light editor — and the system runs itself. Just give the agent the goal and get out of the way.
It's Pollyanna-ish. And I think it's wrong.
I've been running an agentic transition at scale for the past month — a real engineering organization, real codebase, real customers. Not a side project, not a demo, not a weekend experiment. Engineers directing Claude Code to produce work across multiple parallel workflows, with tooling we've built to manage the flow of code into production. I've seen what works and what breaks.
Agents Are Tools, Not Actors
Agents do not produce good code on their own. They make mistakes constantly. They'll generate something that compiles, passes tests, and looks reasonable at first glance — and then you look closer and find naming inconsistencies, architectural drift, subtle misunderstandings of intent, or decisions that make sense in isolation but don't compose well with the rest of the system.
You can get an agent to produce something. But I'm concerned about the long-term sustainability of what agents produce when humans aren't actively managing the output. The first PR looks fine. The tenth starts showing seams. The hundredth, if nobody's been minding the patterns and the architecture, gives you a codebase that works but that nobody can reason about. What agents really accelerate isn't just technical debt — it's architectural entropy: code that works locally but loses global coherence. That's a cost nobody in the "agents do everything" camp is accounting for.
Agents are tools that humans use. Powerful tools — the most leveraged tools we've ever had in this profession. But tools that require a skilled operator. A table saw doesn't design furniture. It cuts what you tell it to cut, and if you're not paying attention, it cuts wrong.
The bottleneck in agentic development isn't code generation. It's human judgment applied at the right boundaries. That's what determines commit scope, catches architectural drift, and keeps a codebase coherent over time. Remove the judgment and you haven't automated engineering — you've just automated the production of mess.
The Discipline Tax
Agentic development doesn't reduce the need for engineering discipline. It raises the bar.
You have to plan before the agent runs. Not vaguely, specifically. What's the scope of this change? What's the commit boundary? What should the agent know about the surrounding code? If you're lazy about this, the agent will happily produce a sprawling, unfocused changeset that technically addresses the goal but is impossible to review.
You have to review what the agent produces. Really review it — not rubber-stamp it. The temptation is real: the agent wrote it, it passes tests, ship it. But that's how architectural rot sets in at machine speed. Every piece of agent output needs human eyes, and those human eyes need to be informed and critical.
You have to think about how the work gets into production, not just how it gets produced. This is the shift from personal productivity to organizational productivity — and it's the part most people skip. The code sitting in your branch doesn't matter. The PR waiting in someone's queue doesn't matter. It's all inventory, and inventory is waste. What matters is the rate at which reviewed, tested, and approved work reaches customers. Discipline means keeping that pipeline moving: small commits, small PRs, fast reviews, and constant attention to the flow.
Discipline before you run the agent. Discipline while you review its output. Discipline after the code leaves your machine — how it flows through the organization. That's the tax you pay for the leverage agents give you. Skip it, and the leverage works against you.
Who's Selling the Autonomy Dream?
Here's something I've noticed: the people most enthusiastic about full agent autonomy tend to be people who don't work in teams.
Solo contractors. Semi-retired engineers exploring the space. Non-technical founders trying to stand up a startup without an engineering team. And the evidence they cite is everywhere on LinkedIn: "I built this app in three days with Claude." Anthropic built, in a week and for $20K, a C compiler that compiles the Linux kernel. An Amazon VP built a CMS in a weekend. These stories are constant, and they're impressive on their face.
My strong suspicion is that none of them will last.
To be clear: time-to-market matters. There are situations where you trade long-term stability for speed, shipping fast to learn, to capture a market, to prove a concept. I'm not arguing against that. But the real goal isn't speed or sustainability. It's both. You want to ship fast and build something that lasts. Discipline is what lets you have both. Without it, you get speed now and pain later. With it, you get speed now and a codebase you can still work with in six months.
When one genuinely outweighs the other, let it win. It's all situational. But you should be making that tradeoff consciously, not stumbling into it because you let an agent run unsupervised.
Nobody follows up on those LinkedIn stories six months later. Nobody asks: Can someone else maintain this? Does it handle edge cases at scale? What happens when requirements change, and you need to modify code that an agent produced in a sprint with no architectural intent behind it? The demo is the easy part. Sustainability is where software lives or dies, and it's the part of this conversation that keeps getting skipped.
In the solo context, that's fine. The solo builder is the reviewer, the architect, and the deployer. There's no one else's review queue to clog, no shared codebase to keep coherent, no organizational throughput to worry about. The code just has to work for them, right now.
That experience is real, and I'm not dismissing it. But it doesn't generalize to where most serious software is actually built: teams, shared codebases, production systems maintained by people who didn't write the original code. In that world, discipline isn't optional. It's the whole game.
The Bias I'll Own
I'll cop to something: I'm a 30-year engineer. Writing code, building systems, leading engineering teams: that's who I am. My perspective comes from running systems where failure has consequences. So take that for what it's worth.
Here's what I know: I've seen what works in getting code out the door. I've watched agents produce PRs with hundreds of commits that gum up the entire review pipeline. I've seen what happens when engineers treat agent output as trustworthy by default. I've built tooling specifically to manage the bottleneck that agent velocity creates — because the bottleneck is real, and it doesn't solve itself.
Here's what I suspect — and what organizations are about to find out the hard way: agent-produced code won't hold up over time without sustained human judgment. We haven't been at this long enough to know if I'm right. My experience is biased toward systems that have to survive change, turnover, and scale, but that's exactly the point. I look at what agents produce, and it's mediocre at best. Which makes sense: these models were trained on the code that's out there, and most of that code is average at best. Average code gets you running software. Average code does not make great systems. Great systems require intent, coherence, and judgment applied consistently over time. That's human work.
The solo practitioners advocating full autonomy are also speaking from experience. But it's experience in a context that doesn't have the failure modes I'm dealing with daily. Neither side is lying. But one side is operating in the environment where most professional software actually gets built.
The Bet
The industry is racing toward full agent autonomy and humans stepping back. I'm making a different bet: that the organizations that treat this as a discipline problem, not a tooling problem, not an autonomy problem, will be the ones still standing when the hype clears.
Better agents will come. I don't doubt that. But better agents don't eliminate the need for planning, review, and organizational thinking. They raise the stakes on all three. The mess you can create with a bad agent is proportional to the speed at which it operates. More capability demands more discipline, not less.
We'll know who's right. Watch the review queues. Watch the mean time to re-understand the code. Watch how many agent-produced PRs get reverted or reworked. Watch how long it takes to onboard a new engineer into an agent-built codebase. Those numbers will tell the story.
The honest truth is that this is discipline we should always have had. We got away without it because things moved slowly enough to course-correct. Ship a sloppy PR, notice the problems downstream, and fix them before they compound. That margin is gone. When agents produce code at speed, sloppiness compounds at speed.
We're in the early days of a real transformation in how software gets built. The people who get it right won't be the ones who figured out how to remove humans from the process. They'll be the ones who figured out what humans need to do differently now that the process has changed.
Discipline is the point. Everything else follows from it.
Agentic Development Operating Rules
- Humans define clean commit boundaries.
- No PR merges without a human who can explain the design.
- Review throughput caps agent throughput.
- Architectural patterns are documented before agents touch the code.
- Agent output is guilty until proven coherent.
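One of these rules — "Review throughput caps agent throughput" — can even be enforced mechanically, by bouncing changesets too large to review carefully. Here's a minimal sketch in Python; the thresholds and the `review_gate` helper are illustrative assumptions, not the actual tooling described above. In CI you'd feed the numbers from something like `git diff --shortstat origin/main...HEAD`.

```python
# Hypothetical pre-merge gate: reject changesets too big to review carefully.
# Thresholds are made-up for illustration; tune them to your team's real
# review capacity, not to what the agent can produce.

MAX_FILES = 20          # assumption: beyond this, review quality collapses
MAX_CHANGED_LINES = 400 # assumption: roughly one sitting's careful review

def review_gate(files_changed: int, lines_changed: int) -> tuple[bool, str]:
    """Return (ok, reason) for a proposed changeset."""
    if files_changed > MAX_FILES:
        return False, f"{files_changed} files changed exceeds cap of {MAX_FILES}"
    if lines_changed > MAX_CHANGED_LINES:
        return False, f"{lines_changed} lines changed exceeds cap of {MAX_CHANGED_LINES}"
    return True, "changeset is reviewable"

# A small, scoped PR passes; a sprawling agent changeset gets split up first.
print(review_gate(files_changed=3, lines_changed=120))
print(review_gate(files_changed=45, lines_changed=2600))
```

The point isn't the specific numbers. It's that the cap is set by human review capacity, and the agent's output is forced to fit it — not the other way around.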