This is the final part of a 5-part series on agentic engineering. Part 4 laid out architectural principles for choosing languages and runtimes. This post puts a handful of worked examples through those principles, just to show what the analysis looks like in practice.

This isn’t a list of “the languages I use”, and it isn’t exhaustive. It’s a deliberate move away from personal taste — what I happen to know, what’s fashionable, what the team is already comfortable with — and toward something more objective: which language properties best fit the agent and the problem domain in front of you. Different problems will pull you toward different answers, and the examples below are picked specifically to make that point.

The heuristics from the previous post still apply: strong typing, fast compile/test loops, simple language surface area, low dependency risk, batteries-included ecosystem, good fit with the harness. With those in mind, here are some illustrative examples.

TypeScript

Still a strong default, especially on the client.

Pros

  • widely represented in training data — agents are very fluent here
  • strong typing
  • huge ecosystem
  • natural fit for web products

Risks

  • high supply-chain risk (using Bun or Deno helps, but doesn’t eliminate this)
  • fragmented tooling — Bun or Deno cuts through most of this
  • single-threaded by default (worker threads exist, but aren’t the natural model)

Worth noting: Claude Code itself uses Bun + TypeScript. Strong typing, fast feedback, and one of the largest training corpora of any language make for a meaningful combination for the agent — particularly for client-side or full-stack web problems where the domain happens to align with that ecosystem.

Kotlin

Solid for service work where the JVM is the right runtime.

Pros

  • strong typing
  • mature JVM ecosystem
  • good for service architecture
  • excellent testing and HTTP support

Risks

  • slower feedback loops than I’d like
  • more complex types == more complex error messages (and more places agents can get tangled)
  • JVM overhead
  • medium supply-chain risk (better than JS, not great)

Kotlin fits well when JVM context is already a hard requirement and the problem benefits from a typed, expressive language with mature libraries. From a purely agent-first perspective, the feedback-loop story is the weakest part of the picture — agents pay for slow compile/test cycles in retries and tokens.

Python (with a type checker)

Highly effective for AI/ML, data, and evaluation-heavy systems.

Pros

  • dominant ecosystem for ML, AI, and data workflows
  • fast iteration, low ceremony
  • excellent for experimentation, evaluation, and internal tooling
  • widely represented in training data

Considerations

  • dynamic typing requires real discipline (and tooling) to avoid runtime errors
  • supply-chain risks similar to JS — large transitive dependency graphs are common

Python is the natural fit where the system interacts closely with models, data, or evaluation loops — the ecosystem is doing real work for you. In an agentic setup it should be paired with mandatory type checking (mypy, pyright). Untyped Python in an agent loop is asking for trouble — every type bug becomes a round-trip cost.
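
To make that round-trip cost concrete, here’s a minimal sketch of the kind of bug a checker catches before anything runs (the function is hypothetical, purely for illustration):

    def mean_score(scores: list[float]) -> float:
        # The annotation lets mypy or pyright verify every call site
        # statically, instead of letting bad inputs surface at runtime.
        return sum(scores) / len(scores)

    # Untyped, an agent might generate this call and only discover the
    # bug by running the code:
    #
    #   mean_score(["0.8", "0.9"])  # TypeError at runtime (sum of strings)
    #
    # With the annotation in place, pyright or mypy flag the call before
    # execution, so the mistake costs one check instead of a full
    # run-and-debug round trip.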

Go

Strong agent-fit for server-side problems with a simple shape.

Pros

  • very fast compile times
  • simple language
  • strong typing
  • fast runtime
  • excellent standard library
  • low cognitive overhead
  • low supply-chain risk (few dependencies is the norm because the standard library is so good)

Risks

  • less expressive
  • verbose error handling, which translates directly into higher token usage

Go ticks an enormous number of agentic boxes — fast feedback, simple surface area, batteries included. The main cost is verbosity, and verbosity in an LLM world is paid in tokens. The trade-off depends on how chatty the problem domain is.

Elixir

Strong choice for distributed, concurrent, and fault-tolerant systems.

Pros

  • excellent model for many small, independent processes
  • built-in supervision and fault tolerance (BEAM)
  • well suited for messaging, orchestration, and real-time systems
  • encourages simple, isolated components
  • low/medium supply-chain risk (fewer deps than JS/Python/Rust)

Considerations

  • typing story is improving, but still less strict than the alternatives
  • smaller ecosystem

Elixir is interesting because the runtime resembles an agentic model itself — many workers, message passing, retries, supervision. For problem domains that match that shape, it reduces the amount of custom orchestration and resilience machinery you’d otherwise have to build. (Reportedly part of why OpenAI ended up using it.)
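
To make “custom orchestration and resilience machinery” concrete, here’s a rough sketch of the restart-with-backoff loop you’d otherwise hand-roll (written in Python to match the earlier example; the names and restart policy are illustrative, not a real API):

    import time
    from typing import Callable

    def supervise(worker: Callable[[], None],
                  max_restarts: int = 5,
                  backoff: float = 0.5) -> None:
        # A hand-rolled "supervisor": run the worker, restart it when it
        # crashes, back off a little more each time, give up eventually.
        restarts = 0
        while True:
            try:
                worker()
                return  # worker finished cleanly
            except Exception as exc:
                restarts += 1
                if restarts > max_restarts:
                    raise RuntimeError("restart budget exhausted") from exc
                print(f"worker crashed ({exc!r}); restart {restarts}/{max_restarts}")
                time.sleep(backoff * restarts)  # linear backoff between attempts

On the BEAM, that whole loop collapses into a declaration: a child spec under a Supervisor with a strategy like :one_for_one, with the runtime owning restarts across thousands of processes at once.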

Zig

Interesting where low-level control or performance matters — WebAssembly, hardware, native apps.

Pros

  • explicit and simple language
  • good compile times
  • exceptional multi-platform support (including WebAssembly)
  • extremely low supply-chain risk — dependencies are explicit and transitive dependency graphs are rare
  • exceptional runtime performance

Risks

  • pre-1.0
  • small ecosystem (offset by the rich C ecosystem it can use directly)
  • weaker safety guarantees than Rust (no borrow checker)
  • terrible fit for web services (great for WebAssembly or native apps)

There’s a reason Bun chose Zig over Rust even with its pre-1.0 status. I’ve also used it on a recent personal project — for a problem in its sweet spot, fast compile times and minimal runtime dependencies do real work for both the agent and the human.

Rust

Powerful, but potentially expensive for agents.

Pros

  • extremely strong guarantees
  • memory safety
  • good WebAssembly support
  • excellent C interop (C++ interop is workable but harder)
  • exceptional runtime performance

Risks

  • extremely slow compile times
  • complex language model
  • higher failure/retry rate in LLM code generation
  • higher token and iteration cost
  • real supply-chain risk — massive transitive dependencies are the norm, like JS

Rust has real strengths and real costs from an agentic perspective: every retry is expensive in compile time, the type system is rich enough that errors can spiral, and the dependency graphs grow alarmingly fast. The case for Rust gets stronger the more the problem domain demands what it’s actually good at — correctness, performance, FFI — and weaker the further you drift from that.

The shape of the evaluation

The point of going through these examples isn’t to land on a list of approved languages. It’s to make the evaluation explicit. For any new piece of work, the question I want to ask out loud is roughly:

  • What does the problem actually need? Concurrency model, performance, ecosystem fit, deployment shape, integration constraints.
  • What are the agent’s strengths and failure modes here? Compile speed, type system, training-data fluency, retry cost, error-message quality.
  • Where do those two answers overlap?

Whatever sits in the overlap is what you should be reaching for, even if it’s a language nobody on the team has used before. Whatever doesn’t sit in the overlap should make you uncomfortable, even if it’s the language you’ve used for ten years.

There’s a slightly mischievous version of this worth saying out loud: deliberately picking a language you don’t know well might actually be useful. The whole shift in the series is humans doing less reading and writing of code and more steering of intent and constraints. If you choose a language you’re already fluent in, the temptation to drop down into the editor and “just fix it yourself” is enormous. Choose one you can’t fluently read, and that temptation evaporates. You’re forced to stay in the steering seat. That’s not a reason to pick badly — agent fit and problem fit still come first — but among otherwise-equal options, “I won’t be able to resist meddling in this one” is a real argument against a familiar language.

The honest version of all this is the part I find hardest. Most language choices in my career have been driven by personal taste or team familiarity. Agentic engineering forces a more boring question: given this agent and this problem, what’s the right tool? Sometimes that answer is the one I’d have picked anyway. Sometimes it isn’t, and that’s the more interesting case.

A note on what I’m not optimising for

A few things I deliberately don’t put much weight on any more:

  • Team familiarity. Less of a constraint than it used to be. Agents can write competently in unfamiliar languages; humans can read them with help.
  • Library count. What matters is whether the standard library and a small set of vetted dependencies cover the problem. Big ecosystems are often a liability in an agentic context.
  • Trendiness. Boring runtimes are good. Boring runtimes have predictable failure modes.

Wrapping up the series

If you’ve read all five posts, the core argument is fairly simple:

  1. Keep the Agile values. Change the mechanisms.
  2. Pairing becomes co-steering. Humans mob on intent, agents mob on execution.
  3. Quality emerges from the harness, not from human discipline alone. Improve the system, not the output.
  4. Architecture gets re-evaluated against agent legibility, fast feedback, and safe deployment.
  5. Languages get re-evaluated against the same criteria — chosen for fit between the agent and the problem domain, not for taste.

The throughline: we’re building systems that build software. The interesting craft has moved up a level. I’m still learning how to do it well, and I fully expect to be wrong about a lot of this by tomorrow, or next week, or this time next year. That’s the name of the blog after all.