
Agents Reprice Everything


We run three backend languages. That was never a goal — it's what happens when pragmatic choices accumulate over time. Nobody sat down and decided we needed Scala, Python, and TypeScript. Each one arrived for defensible reasons, and each one stayed because the cost of leaving was higher than the cost of staying.

What changed isn't that we decided to converge. What changed is that we looked at the whole stack through the lens of agentic development, and every language had a problem. They weren't the same problem.

Scala is our core domain language, and it arrived at an awkward moment carrying two independent problems. Akka, which we've relied on heavily, has gone closed source — a migration we'd face regardless of anything else. Separately, the Scala 2 to Scala 3 upgrade isn't a routine version bump; it's an invasive rewrite. Two migrations, neither optional, neither waiting for the other. When you look hard at what you're buying through all of that, the answer is uncomfortable: Scala 3 has a smaller community, a thinner training corpus, and an ecosystem that is narrowing, not widening.

Python we use for data-adjacent work and newer services, and we use it seriously — strictly-typed domain model Python, not casual scripting. But that seriousness works against you. The feedback loop spans multiple tools, the type system is gradual, and too many issues are still discovered late. The result is more iterations than you'd expect from a language being asked to support that level of rigor — and those iterations are slower. Agents don't produce this kind of Python well. The training corpus is overwhelmingly casual Python, not strictly typed domain-model Python. What we're asking for is an edge case use of the language. A necessary one, in our view, but the corpus doesn't know that.

TypeScript is where we've had the most active internal disagreement. There is a real case for it in the backend, and smart people on the team have made it. But TypeScript is still fundamentally a JavaScript ecosystem with a compile-time type system layered on top. Those types don't exist at runtime, which means correctness at system boundaries has to be re-established through separate validation or left partially unchecked. Underneath that, the Node ecosystem carries a persistent supply chain cost: deep transitive dependencies, frequent CVEs, and ongoing churn that have to be actively managed. Large organizations make this work, but they don't eliminate the risk — they absorb it, building the people, process, and tooling required to continuously manage it. Mitigations like lockfiles, vendoring, and auditing don't solve the problem; they just shift it to ongoing operational work. We've known this for years. The trade we made was explicit: the cost of avoiding that ecosystem was higher than the cost of carrying it, so we carried it.

We could have kept going. None of these problems were blockers. The stack worked the way most pragmatic engineering choices work: well enough, with known costs absorbed over time.

Then the agentic world changed the math. Not just the math on languages, but the math on what kind of system software development actually is.

Repriced Criteria, Repriced Problems, Repriced Solutions

This is worth stating precisely, because the naive version of the argument is too simple. Agentic development doesn't just make some problems worse. It changes the evaluation criteria, reprices the problems you already have, and reprices the solutions that were previously off the table. All three.

The shift underneath all of this is a control system shift. Before agents, humans absorbed inefficiency. They resolved ambiguity inline, and the iteration cost was mostly cognitive. After agents, that no longer holds. Code generation is cheap, but convergence is not. Every inefficiency is amplified, every ambiguity shows up as another failed loop, and iteration cost becomes mechanical and multiplicative. Language choice is where this first shows up, but it isn't the root cause. The root cause is that the system, not the human, is now paying for correctness.

Python's slow iteration cycle, which a human engineer absorbs as background cost, compounds badly in an agentic context. The agent doesn't amortize the wait. It cycles through it repeatedly — each iteration slow, each one necessary to converge on something correct. The total tax is significant in a way that doesn't show up acutely when a human is doing it.

The Node supply chain problem was real before agents arrived; we knew about it, we just had nowhere better to go. What changed isn't the problem. What changed is that the destination became viable. The cost of the problem didn't go up. The cost of solving it went down.

The Scala corpus problem, which was an abstraction before, becomes concrete when you watch agents write Scala 3 code. The code is valid. It isn't idiomatic. Idiomatic matters — not as an aesthetic preference, but because the training corpus was built on idiomatic code. When agents drift from idiom, they drift from the patterns that work.

To compress it: Python didn't get worse. Node didn't get riskier. Scala didn't get smaller. But the cost of tolerating their weaknesses exploded under machine iteration.

That's the problem side. The opportunity side is that the same repricing opened a door we'd looked at before and walked away from. Rust was always a defensible backend choice on the merits — fast, correct, with a type system that actually constrains the solution space. What kept it off the table was the entry cost. The learning curve was steep enough that adopting it across a team felt like a multi-quarter investment before you got real throughput.

Agents changed that calculation. Not because the language got easier — it didn't. The borrow checker is still the borrow checker. Rust's ownership model is still demanding. What changed is the entry. Agents lower the waterline enough that you can wade in before you have to swim. A learning cliff becomes a slope. You engage with the hard parts sooner, at a higher level of abstraction, with more scaffolding already in place. The language doesn't get shallower. The beach does.

We ran the criteria honestly. When we listed what we were actually optimizing for, a significant portion of the list turned out to be criteria we wouldn't have included five years ago.

The Criteria

Iteration velocity to a correct result. How quickly does the agent converge on something that actually works? Rust's compiler is fast, and its errors are precise and actionable. Each iteration moves you forward. Python's toolchain is slow, and it takes more cycles to reach a sane result — each one slow, each one necessary. That compounds. The question isn't just how fast the feedback loop is; it's how reliably each pass through it closes the gap.

Typed to the core of the runtime. Not whether the language has types, but whether the runtime enforces them. Rust's type system isn't layered on top of a dynamic runtime; it's structural, all the way down. Python's types are gradual and advisory. TypeScript's types are erased before execution. In both cases, the correctness boundary is partially fictional: you have to re-establish it at system boundaries through validation or convention. A type system that the runtime doesn't enforce is a suggestion.
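
The "re-establish correctness at the boundary" point can be made concrete. A minimal std-only sketch, with a hypothetical Port type: in Rust, the boundary check happens once, at construction, and the resulting value carries its guarantee at runtime, rather than being an annotation that's erased before execution.

```rust
use std::str::FromStr;

// Hypothetical domain type: once constructed, a Port is guaranteed valid.
// The check happens exactly once, at the boundary, and the type carries
// that guarantee at runtime -- there is no erased annotation to re-validate.
#[derive(Debug, PartialEq)]
struct Port(u16);

impl FromStr for Port {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Untrusted input crosses the boundary here, or not at all.
        let n: u16 = s.parse().map_err(|_| format!("not a number: {s:?}"))?;
        if n == 0 {
            return Err("port 0 is reserved".to_string());
        }
        Ok(Port(n))
    }
}

fn main() {
    // Valid input becomes a typed value the rest of the program can trust.
    assert_eq!("8080".parse::<Port>(), Ok(Port(8080)));
    // Invalid input is rejected at the boundary -- it never becomes a Port.
    assert!("0".parse::<Port>().is_err());
    assert!("web".parse::<Port>().is_err());
}
```

Everything downstream of the boundary takes a Port, not a string, so there is no second place where the check has to be remembered.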

Type system as search-space reduction. Even among languages with real type systems, there's a question of how much the types constrain what the agent can do wrong. This is a different criterion from runtime enforcement — it's about the effect on agent behavior during generation. Rust's borrow checker enforces memory safety at compile time. Algebraic types force exhaustive handling. When an agent generates code against a type system that actively constrains the solution space, it isn't exploring — it's being guided. That guidance is a search-space reduction mechanism for machine reasoning, and it's the difference between types that document intent and types that narrow outcomes.
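
A small sketch of what "exhaustive handling" buys, using a hypothetical payment state: the compiler rejects any match that doesn't cover every variant, so whole classes of incomplete code are simply unwritable.

```rust
// Hypothetical payment state modeled as an algebraic type.
#[derive(Debug)]
enum Payment {
    Pending { attempts: u32 },
    Settled { amount_cents: u64 },
    Failed { reason: String },
}

// The match must be exhaustive: add a variant to Payment and this function
// stops compiling until the new case is handled. The type system narrows
// what the generator -- human or agent -- is able to write at all.
fn describe(p: &Payment) -> String {
    match p {
        Payment::Pending { attempts } => format!("pending ({attempts} attempts)"),
        Payment::Settled { amount_cents } => format!("settled: {amount_cents} cents"),
        Payment::Failed { reason } => format!("failed: {reason}"),
    }
}

fn main() {
    let p = Payment::Settled { amount_cents: 1299 };
    assert_eq!(describe(&p), "settled: 1299 cents");
}
```

The constraint works during generation, not after it: an agent that omits a case gets a compile error naming the missing variant, which is exactly the guided search the paragraph describes.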

No magic. Can the agent see what the code actually does? Implicit behavior, annotation-driven wiring, and runtime reflection are all invisible surface areas that an agent can't reason about. Rust is explicit by default. What you see is what happens. That transparency isn't a stylistic preference; it's a prerequisite for machine reasoning. Agents navigate what they can see. This isn't an argument against abstraction; it's an argument against implicit abstraction. A proc macro is explicit; you can see that code is being generated. A framework that materializes behavior from annotations is not. The distinction matters because, at scale, no abstraction creates its own problem: fifty thousand lines of near-identical code with meaningful differences buried in the sameness is no easier to review than magic. The goal is the abstraction you chose, not the abstraction the framework imposed.
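
For contrast, a sketch of explicit wiring with hypothetical service types: every dependency is constructed and passed by hand, so the full object graph is visible in the source, rather than materialized from annotations at runtime.

```rust
// Hypothetical services, wired explicitly. Nothing here is injected by a
// framework -- the object graph is plain code an agent can read and trace.
struct Config { db_url: String }
struct Database { url: String }
struct UserService { db: Database }

impl Database {
    fn connect(cfg: &Config) -> Database {
        Database { url: cfg.db_url.clone() }
    }
}

impl UserService {
    fn new(db: Database) -> UserService {
        UserService { db }
    }
}

fn main() {
    let cfg = Config { db_url: "postgres://localhost/app".to_string() };
    let db = Database::connect(&cfg);
    let svc = UserService::new(db);
    // Anyone reading this file sees exactly where svc's database came from.
    assert_eq!(svc.db.url, "postgres://localhost/app");
}
```

In an annotation-driven framework, the equivalent of `main` doesn't exist as code; the construction order and the dependency edges live in the container, which is precisely the surface area an agent can't see.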

Training corpus quality. Is there enough idiomatic code in the wild that agents write good code, not just valid code? This is undersized as a criterion on traditional rubrics. Agents write very good Rust. The difference in output quality compared to a narrower corpus is observable and significant.

Supply chain integrity. Cargo's supply chain posture is categorically different from Node's. The baseline level of churn and CVE exposure is not comparable. And now that Rust is a viable destination, that difference is actionable in a way it wasn't before.

Language stability and governance. Does the language break its own contracts? The Scala 2 to 3 migration is a cautionary tale. Akka going closed source is another. A backend language is infrastructure, and infrastructure that requires invasive rewrites on someone else's schedule is a liability. Rust has been remarkably stable, and it remains fully open source under a governance model that shows no signs of change.

Organizational legibility. Can you defend the choice to people who don't evaluate languages for a living? This criterion is invisible on most technical rubrics and load-bearing in most actual organizations. A language decision has to survive a bad quarter.

These criteria didn't appear on the language evaluation rubrics we would have used five years ago. They're about what a language looks like to an agent iterating under compiler guidance, to a team learning under agentic scaffolding, and to an organization that has to explain its technical bets to non-technical leadership. Those are new inputs, and they changed the output.

Rust wasn't the only candidate. OCaml was on the shortlist, and on pure type-theoretic merits, you could argue it belongs there. Algebraic types, pattern matching, a compiler that enforces correctness: the structural properties are real. But the ecosystem is split between competing standard libraries, and OCaml 5's removal of the global lock, while the right technical move, has put the runtime in a transitional state that hasn't fully stabilized. More fundamentally, a language decision in a non-engineering company has to survive scrutiny from people who evaluate outcomes, not toolchains. I've chosen what are considered esoteric languages for production before, in organizations like this one. Those choices came back to bite me, not because the languages failed, but because the first time anything went wrong, the language became the explanation. It doesn't matter if the problem is unrelated. The question "why did you choose that language?" becomes unanswerable once leadership believes it's the cause. Rust was esoteric five years ago. It isn't anymore. AWS uses it. The Linux kernel accepts it. It shows up on job postings that non-technical stakeholders can find. That organizational legibility is a real criterion, and pretending it isn't is how technical leaders burn political capital on fights they didn't need to have.

The answer the criteria produced was Rust.

Does Craft Still Compound?

There's a question I can't fully leave alone: Does good code still matter in the way it used to?

My working answer is yes, and the agentic context is making me understand why for the first time.

We use snafu for error handling in Rust, a library that forces you to explicitly type your errors, encoding context and causality into the error structure itself. The error isn't a string. It's a structured claim about what went wrong and where, typed explicitly into the domain model. When an agent encounters that, it doesn't have to infer context from a message; the context is in the structure. The iteration is faster. The reasoning is more precise.
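
The shape is worth showing. This is a std-only sketch of the structure snafu encourages (the names are hypothetical, and snafu would derive the Display and Error impls rather than having you write them): what went wrong, where, and why are all typed fields, not a string to be parsed.

```rust
use std::fmt;

// A structured error: each variant is a typed claim about a failure mode,
// carrying its context (path, line, underlying cause) as fields.
#[derive(Debug)]
enum ConfigError {
    Read { path: String, source: std::io::Error },
    Parse { path: String, line: usize },
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Read { path, source } => {
                write!(f, "could not read config at {path}: {source}")
            }
            ConfigError::Parse { path, line } => {
                write!(f, "could not parse config at {path}, line {line}")
            }
        }
    }
}

impl std::error::Error for ConfigError {
    // Causality is part of the structure: the Read variant exposes the
    // underlying io::Error as its source.
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            ConfigError::Read { source, .. } => Some(source),
            ConfigError::Parse { .. } => None,
        }
    }
}

fn main() {
    let err = ConfigError::Parse { path: "app.toml".into(), line: 12 };
    // A caller -- or an agent -- matches on the variant and reads the
    // fields instead of parsing a message.
    assert_eq!(err.to_string(), "could not parse config at app.toml, line 12");
}
```

The message is derived from the structure, not the other way around, which is what lets an agent reason about the failure without inference.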

That's not a human-readability win. It's an information density win. Agents consume information density the same way humans do, maybe more voraciously.

And that density matters on the other side, too, not just generation, but review. Agents increase production rate. Strong types increase correctness. Iteration gets faster. The new constraint becomes human comprehension bandwidth. Can the person arbitrating the output actually understand what changed and whether it's acceptable, at the rate agents produce it? Information-dense code, expressive types, structured errors, and clear module boundaries aren't just easier to generate correctly. They're also easier to arbitrate. The code carries its own argument for why it's right.

The flip side: sloppiness compounds, too. Stringly-typed interfaces, anemic domain models, errors that are just strings, those were always technical debt. But a human engineer could hold context across the mess, fill in gaps with experience, and navigate ambiguity. Agents fall into it. The debt becomes load-bearing in ways it never was before. And the human reviewing the output loses the one thing that made review tractable, the ability to see, from the structure alone, whether the code is doing what it should.

The things that made human engineers better (expressive types, meaningful error structure, clear module boundaries, honest naming) weren't just aesthetic preferences or readability conventions. They were information density. Precision about what the code means. And that precision was always doing more work than we could fully articulate.

The agentic era isn't changing what good code is. It's making the reason it was always right finally legible.