The Coming PR Tsunami
For decades, code production was the bottleneck. Writing software took time: thinking time, typing time, debugging time. Everything downstream waited on the humans doing the work.
LLMs broke that constraint. Tools like Claude Code mean one engineer now produces what used to take several. Code creation scales non-linearly for the first time in the history of the industry.
But removing one bottleneck doesn't eliminate constraints, it moves them. The throughput that was stuck behind code production now flows freely and slams directly into code review.
Review is still comprehension-bound. A human has to read the code, understand the change, assess the risk, and make a judgment. That cognitive work doesn't parallelize and it doesn't scale with the tools that write the code.
So queues swell, merges slow, and reviewers burn out. The productivity gains from accelerated creation evaporate in the review backlog.
This isn't a corner case. Any engineering organization adopting LLMs at scale will hit this wall.
The solution is the same tool that created the problem.
LLMs for Code Review
The bottleneck in review isn't approval authority, it's comprehension. Understanding a change well enough to judge it takes time, and that time doesn't compress just because the code was written faster.
LLMs can take on most of that cognitive load, but they need the right inputs.
That means small, focused PRs: one atomic commit each, not a giant squash hiding a week of work, but a single self-contained change that both the model and a human can parse quickly. When changes are genuinely small, the model produces reliable scoring on change size, coherence, and commit cleanliness. These signals surface what usually takes reviewers several minutes to detect.
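As a rough sketch of what such scoring signals might look like in code (the thresholds and weighting here are invented for illustration, not calibrated values):

```python
from dataclasses import dataclass

@dataclass
class ChangeSignals:
    lines_changed: int   # total added + removed lines in the diff
    files_touched: int   # breadth of the change
    commit_count: int    # commits on the PR branch

def size_score(s: ChangeSignals) -> float:
    """Crude 0..1 reviewability score: 1.0 means comfortably small and atomic.

    Thresholds (200 lines, 5 files, 1 commit) are illustrative only.
    """
    size_ok = min(1.0, 200 / max(1, s.lines_changed))
    spread_ok = min(1.0, 5 / max(1, s.files_touched))
    atomic_ok = 1.0 if s.commit_count == 1 else 1.0 / s.commit_count
    return round(size_ok * spread_ok * atomic_ok, 3)

small = ChangeSignals(lines_changed=80, files_touched=2, commit_count=1)
large = ChangeSignals(lines_changed=2400, files_touched=31, commit_count=9)
print(size_score(small))  # 1.0
print(size_score(large))  # 0.001
```

A real system would feed signals like these to the model alongside the diff; the point is that each one is cheap to compute and degrades sharply as a PR stops being atomic.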
A useful LLM-driven review describes what changed and why, highlights risks, calls out missing tests or documentation, and points the reviewer to specific lines worth attention. It surfaces the questions a reviewer would naturally ask so the engineer can address them up front.
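One way to keep such reviews consistent is to have the model fill in a fixed schema rather than emit free-form prose. A minimal sketch, with field names invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class LineComment:
    path: str
    line: int
    note: str

@dataclass
class ReviewSummary:
    """Structured output an LLM reviewer could be asked to produce per PR."""
    what_changed: str                                  # plain description of the change
    why: str                                           # intent, inferred from code and commit message
    risks: list[str] = field(default_factory=list)
    missing_tests: list[str] = field(default_factory=list)
    missing_docs: list[str] = field(default_factory=list)
    flagged_lines: list[LineComment] = field(default_factory=list)

    def needs_attention(self) -> bool:
        # The human reviewer can triage on this before reading anything else.
        return bool(self.risks or self.missing_tests or self.flagged_lines)

review = ReviewSummary(
    what_changed="Adds retry logic to the payment client.",
    why="Intermittent 502s from the upstream gateway.",
    risks=["Retry loop has no backoff cap"],
    flagged_lines=[LineComment("client/payments.py", 88, "unbounded retries")],
)
print(review.needs_attention())  # True
```

A fixed schema also makes the review machine-checkable: empty risk and test fields can auto-approve trivial changes, while anything flagged routes to a human.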
The reviewer moves from excavation to judgment, and a PR that once required 30–60 minutes now takes 5–10.
Large PRs break this entirely. Model scoring degrades, atomic analysis fails, coherence detection becomes unreliable, and safe reverts get harder. LLMs make it easy to generate huge changes quickly, which tempts engineers to ship more than the system can absorb. When the model flags a PR as oversized, the fix is straightforward: split it and resubmit. The model already identified the problem.
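An oversized-PR check doesn't even need a model for the first cut; churn from `git diff --numstat` is enough to gate on. A sketch, with the 400-line limit as an assumed, team-tunable threshold:

```python
MAX_LINES = 400  # illustrative threshold; tune per team

def total_changed(numstat: str) -> int:
    """Sum added + removed lines from `git diff --numstat` output.

    Each numstat line looks like "<added>\t<removed>\t<path>"; binary
    files report "-" for both counts and are skipped.
    """
    total = 0
    for line in numstat.strip().splitlines():
        added, removed, _path = line.split("\t")
        if added != "-":
            total += int(added) + int(removed)
    return total

def too_large(numstat: str, limit: int = MAX_LINES) -> bool:
    return total_changed(numstat) > limit

sample = "120\t30\tsrc/api.py\n400\t250\tsrc/models.py\n-\t-\tassets/logo.png"
print(too_large(sample))  # True: 800 changed lines exceeds the limit
```

Run in CI before the model review, a check like this turns "split it and resubmit" into an automatic gate rather than a reviewer's plea.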
Calibration and Ownership
Every automated review system lives or dies by signal quality. Too many false positives and engineers tune it out. Too few and the summaries mean nothing.
Teams calibrate against reality: adjust risk detection based on production incidents, check model scores against human reviews, tighten the loop until engineers trust what the system flags.
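Checking model scores against human reviews reduces to a standard precision/recall comparison. A sketch, with PR identifiers and numbers invented for illustration:

```python
def precision_recall(model_flagged: set[str], human_flagged: set[str]) -> tuple[float, float]:
    """Compare model risk flags against human-confirmed issues.

    Precision: of the PRs the model flagged, how many had real issues.
    Recall: of the PRs with real issues, how many the model caught.
    """
    true_positives = len(model_flagged & human_flagged)
    precision = true_positives / len(model_flagged) if model_flagged else 0.0
    recall = true_positives / len(human_flagged) if human_flagged else 0.0
    return precision, recall

model = {"pr-101", "pr-102", "pr-107", "pr-110"}   # model flagged these as risky
human = {"pr-101", "pr-107", "pr-113"}             # humans confirmed issues in these
p, r = precision_recall(model, human)
print(round(p, 2), round(r, 2))  # 0.5 0.67
```

Low precision means engineers will start tuning the system out; low recall means the summaries are hiding real risk. Tracking both over time is the tightening loop the text describes.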
This only works if the engineers own the tool. The code review system lives in the codebase, and engineers cut PRs against it like any other code they own. Calibration isn't a platform team's maintenance burden, it's part of the development workflow, improved incrementally by the same people who use it.
Humans Manage the Process
LLMs don't replace human judgment in review, they compress the time to judgment.
The reviewer still decides what ships. But instead of spending thirty minutes excavating a change to understand what happened, they spend five minutes evaluating a summary the model produced and checking the lines it flagged.
This is the same pattern that works in code production: the human directs, the model executes, the human verifies. Agentic development with human oversight. The tool changes what's bottlenecked, but the structure of the work stays the same.
That means review time becomes more valuable, not less. Teams need to treat it accordingly: dedicated blocks for review work, WIP limits that prevent changes from piling up half-finished, and incentives that reward review contribution rather than just merged code.
The Pattern Repeats
Opening the review bottleneck won't be the last constraint you face.
Once review stops slowing things down, something else surfaces: test orchestration, deploy pipelines, architectural drift, capacity planning. The constraint moves downstream because throughput increased upstream.
The solution will be the same: apply LLMs to the new bottleneck with human oversight. The specifics will differ, you won't review deployments the same way you review code, but the pattern holds. Identify the comprehension-bound work, let the model handle the cognitive load, keep humans in the loop for judgment.
Treat bottleneck removal as ongoing operating practice, not a one-time fix.
The Wave Is Coming
The PR tsunami is coming for every team using LLMs. The productivity unlocked by accelerated code creation will crash into review pipelines that weren't designed for this volume.
The teams that act now will stay ahead of it: small PRs, LLM-accelerated review, protected review time, engineer-owned calibration. They'll compound productivity without burning out the people doing the work.
The teams that wait will drown in their own queues.