Is English really the new programming language?

No — that slogan confuses the medium with the skill. English is how you talk to the model, but describing what you want is not the same as knowing what to hand off, how to specify it, how to verify the output, and who is accountable when it ships. That bundle of judgment has a name — AI fluency — and it is what separates people who get real leverage from these tools from people who get plausible-looking garbage. Everyone has the same models; the difference is four learnable habits, not vocabulary or access.

AI fluency is the skill of operating an AI tool well, as distinct from merely prompting it. It is four habits practiced together: Delegation (deciding what to do yourself, hand off fully, or do together), Description (specifying the request precisely enough to get usable output), Discernment (evaluating the result before you trust it), and Diligence (owning the outcome because your name is on it). Fluent users stop fighting the tool — typing, deleting, re-prompting — and start pointing it at the right target, describing it cleanly, checking it fast, and signing off on work they can defend.

Does AI actually make you faster?

Not as a fixed fact: it amplifies a decision. In METR's 2025 randomized trial, experienced developers working on codebases they maintain were 19% slower with AI, even though they had predicted a 24% speedup. On a standardized, well-documented task, a controlled trial found GitHub Copilot users finished about 56% faster, with the least experienced gaining most. Same class of tool, opposite results: the variable is what you point the AI at and what you do with what comes back, which is exactly the skill of AI fluency.

English Is Not the New Programming Language — AI Fluency Is

In January 2023, AI researcher Andrej Karpathy posted a line that went everywhere: “The hottest new programming language is English.” It got compressed into a slogan — English is the new programming language — and a promise: if you can describe what you want, you can build it.

It’s wrong. Or more precisely, it confuses the keyboard with the craft.

English is the medium. It is not the skill. The skill is knowing what to hand off, how to describe it, how to check the result, and who’s accountable when it ships. That skill has a name — AI fluency — and the gap between people who get real leverage from these tools and people who get garbage isn’t talent, vocabulary, or access. Everyone has the same Claude, the same GPT, the same Gemini. The difference is whether they’ve built four specific habits.

This is for you if you use AI every day and quietly suspect that “just prompt better” isn’t real advice. If you’re hunting for magic prompt templates, this isn’t that — the skill was never the words.

The proof that the tool isn’t the skill

In 2025, the research group METR ran the most rigorous trial we have: 16 experienced open-source developers, working on codebases they maintain, completing real tasks with and without AI. They predicted the AI would make them 24% faster. Measured, they were 19% slower. Same developers. Same celebrated tools. Slower.

Meanwhile, on standardized, well-documented tasks — boilerplate, a fresh web server — a controlled trial found GitHub Copilot users finished about 56% faster, with the least experienced developers gaining the most.

Read those two results together and the slogan collapses.

AI doesn’t make you faster or slower as a fixed fact. It amplifies a decision: what you point it at, and what you do with what comes back.

That decision is the craft — and it has four moves.

[Figure 01. Branching diagram, 'Same tool. Two outcomes.' From a single black box labelled 'the same AI: Claude, GPT, Gemini,' two paths fork. The left path, 'without fluency,' loops through Prompt, Garbage out, and Reprompt back to Prompt (annotated 'times ten, still garbage') and ends at 'Give up. Do it by hand.' The right path, 'with fluency,' runs Delegate, Describe, Discern, Diligence and ends at an oxblood node, 'Ship. Defensible.' Two stat cards read '19% slower: experienced devs on mature code, who predicted plus 24% (METR RCT, 2025)' and '56% faster: standardized task, juniors gain most (GitHub Copilot RCT).' Footer: the variable isn't the tool, it's fluency.]

Figure 01. Same AI, same access. One path loops: prompt, garbage, reprompt, give up. The other operates the tool through four moves and ships. The measured effect of the tool itself ranges from 19% slower to 56% faster; the variable is the person.

Move 01. Delegation: decide what’s yours, what’s the AI’s, what’s shared

What it is. Before you prompt, decide the split. Three buckets: do it yourself, hand it fully to the AI, or do it together — you drive, it drafts.

The failure mode. Most people hand off too much, then own none of the result. They paste a vague ask, accept the first plausible answer, and ship something they couldn’t defend.

The move. For every task, ask “if this is wrong, who pays?” High stakes with your name on it → do it together, never a full hand-off. Low stakes and easily checked → delegate fully and move on. The METR slowdown is what over-delegation looks like at scale: experts let the tool drive on terrain they knew better than it did.
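One way to make that triage rule concrete is the small Python sketch below. The names `Split` and `choose_split` are hypothetical, and the fallback for low-stakes work that is hard to check is an assumption; the rule above only spells out the two named cases.

```python
from enum import Enum


class Split(Enum):
    """The three buckets: yours, the AI's, or shared."""
    DO_IT_YOURSELF = "do it yourself"
    TOGETHER = "do it together: you drive, it drafts"
    DELEGATE_FULLY = "hand it fully to the AI"


def choose_split(high_stakes: bool, easily_checked: bool) -> Split:
    """Answer 'if this is wrong, who pays?' with a working split."""
    if high_stakes:
        # High stakes with your name on it: never a full hand-off.
        return Split.TOGETHER
    if easily_checked:
        # Low stakes and easily checked: delegate fully and move on.
        return Split.DELEGATE_FULLY
    # ASSUMPTION (not stated in the article): low stakes but hard to
    # check still gets a human in the loop.
    return Split.TOGETHER


print(choose_split(high_stakes=True, easily_checked=True).value)
print(choose_split(high_stakes=False, easily_checked=True).value)
```

The point of writing it down is less the code than the habit: the question gets asked before the prompt is typed, not after the output looks plausible.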

Move 02. Description: specificity in, quality out

What it is. The quality of the output is capped by the precision of the request. It’s the oldest rule in tech leadership, and it didn’t change: if you’re not explicit about what you expect, it doesn’t get done — with people or with AI.

The failure mode. “Write me a function to clean the data” → generic output → reprompt ten times → give up and do it by hand.

The move. Give the AI what you’d give a competent contractor: the goal, the constraints (format, length, libraries, style), an example of what “good” looks like, and what to do at the edges. Front-load that context once instead of correcting it ten times. Most “the AI is useless” moments are actually under-specified requests.
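As an illustration, the contractor-style brief can be written down as a small structure before anything goes to the model. This is a minimal sketch: the `Brief` class, its field names, and the sample `clean_names` task are all hypothetical, chosen only to contrast a vague ask with a specific one.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """A request written the way you would brief a competent contractor."""
    goal: str
    constraints: list[str] = field(default_factory=list)
    example_of_good: str = ""
    edge_cases: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Assemble the brief into one block of text to use as the prompt."""
        sections = [f"Goal: {self.goal}"]
        if self.constraints:
            lines = "\n".join(f"- {c}" for c in self.constraints)
            sections.append("Constraints:\n" + lines)
        if self.example_of_good:
            sections.append(f"Example of good output:\n{self.example_of_good}")
        if self.edge_cases:
            lines = "\n".join(f"- {e}" for e in self.edge_cases)
            sections.append("At the edges:\n" + lines)
        return "\n\n".join(sections)


# The failure mode: a goal and nothing else.
vague = Brief(goal="Write me a function to clean the data")

# The same ask with context front-loaded once (sample values are made up).
specific = Brief(
    goal="Write clean_names(names) that normalizes a list of customer names",
    constraints=[
        "Python standard library only",
        "Return a sorted list of unique names",
        "Under 15 lines, with a docstring",
    ],
    example_of_good='clean_names(["  Ada ", "ada", "Bob"]) returns ["ada", "bob"]',
    edge_cases=[
        "Empty list returns []",
        "Skip None and blank strings instead of raising",
    ],
)

print(specific.render())
```

Rendering `vague` produces a single line, which makes the gap visible: everything the model would otherwise have to guess is exactly what the second brief states.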

Move 03. Discernment: verify before you trust

What it is. Evaluation is its own skill, separate from generation. Confidence is not correctness — AI states wrong answers in exactly the same tone as right ones.

The failure mode. You can’t catch an error you don’t have the knowledge to see. This is why offloading everything is a trap: skip the work and you skip the understanding, and then you can’t tell good output from plausible-looking garbage. The leverage quietly inverts.

The move. Decide your check before you read the answer: run the code, test the edge case, verify the cited source, sanity-check the number against something you already know. If you have no way to evaluate the output, you delegated the wrong task (go back to Move 01).
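For code, "decide your check first" can be as simple as writing the expected cases down before looking at the draft. The sketch below assumes a hypothetical `clean_names` task; `ai_draft` is a made-up stand-in for a plausible-looking AI answer, not real model output, and `verify` is an illustrative helper rather than any library's API.

```python
from typing import Callable

# Checks decided BEFORE reading the answer: (input, expected output).
CASES = [
    (["  Ada ", "ada", "Bob"], ["ada", "bob"]),  # trims, lowercases, dedupes
    ([], []),                                    # edge: empty input
    ([None, " "], []),                           # edge: missing and blank values
]


def verify(candidate: Callable[[list], list], cases: list) -> list[str]:
    """Run every pre-decided case and describe each failure."""
    failures = []
    for given, expected in cases:
        try:
            got = candidate(list(given))
        except Exception as exc:  # a crash counts as a failed check
            failures.append(f"{given!r}: raised {type(exc).__name__}")
            continue
        if got != expected:
            failures.append(f"{given!r}: expected {expected!r}, got {got!r}")
    return failures


def ai_draft(names):
    """Hypothetical AI draft: reads fine, but calls .strip() on None."""
    return sorted({n.strip().lower() for n in names})


def reviewed(names):
    """The draft after review: skips None and blank entries."""
    return sorted({n.strip().lower() for n in names if n and n.strip()})


print(verify(ai_draft, CASES))   # one failure, on the None edge case
print(verify(reviewed, CASES))   # no failures
```

The draft passes the happy path and fails only on the edge case, which is the situation where confident tone and correct output come apart.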

Move 04. Diligence: your name is on it

What it is. The AI did the work, but you’re accountable for it. Ownership is the one thing you cannot delegate.

The failure mode. “The AI wrote it” is not a defense your boss, your client, or your reader will accept. Most so-called AI failures are diligence failures — nobody owned the last mile.

The move. Treat every AI output as a draft from a fast, confident, occasionally-wrong junior. You’re the senior reviewer who signs off. Before it ships, you’ve read it, you understand it, and you can defend every line as if you wrote it — because where it counts, you did.

How you know you have it

Watch someone fluent and the four moves disappear into a rhythm. They’re not fighting the tool — typing, deleting, re-prompting, sighing. They’ve stopped prompting and started operating: pointing the capability at the right target, describing it cleanly, checking it fast, owning the result. The tool becomes an extension of their judgment, not a replacement for it. That’s fluency. It looks like flow — and it’s the exact opposite of the ten-prompts-then-give-up loop.

The one thing to remember

Two pianists, one Steinway. One plays scales; one plays Rachmaninoff. The keys aren’t the skill.

English handed everyone the keyboard. What separates the output is what you do next:

Delegate deliberately — decide what’s yours, what’s the AI’s, what’s shared.
Describe precisely — specificity in, quality out.
Discern ruthlessly — verify before you trust; confidence isn’t correctness.
Take diligent ownership — your name is on it, always.

[Figure 02. Ladder diagram, 'AI fluency is four moves,' subtitle 'the craft behind just talk to the AI: run them in order, every time.' Four numbered boxes stack top to bottom. 1, Delegation: decide the split, what's yours, the AI's, what's shared; the move is to ask if this is wrong, who pays; the trap is handing off too much then owning none of the result. 2, Description: specificity in, quality out; the move is to give it the goal, the constraints, an example of good, and what to do at the edges; the trap is a vague ask, generic output, reprompt ten times, give up. 3, Discernment: verify before you trust; the move is to decide your check before you read the answer; the trap is mistaking confidence for correctness, you can't catch an error you can't see. 4, Diligence: your name is on it; the move is to treat output as a draft from a fast, confident, sometimes-wrong junior, you're the senior who signs off; the trap is that 'the AI wrote it' is a defense no one accepts. They terminate in an oxblood node: 'Ship, and defend it. Work you understand and can defend, line by line.']

Figure 02. The four moves as a craft: each one’s job, the move to practice, and the trap it keeps you out of.

Build those four and you stop arguing about whether AI makes you faster. You just get more done — and you can defend all of it.

Sources

1. Quote Investigator. Provenance of Andrej Karpathy’s “The hottest new programming language is English” (X / Twitter, Jan 24, 2023; verified Oct 20, 2024).
2. METR. Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity (arXiv:2507.09089, Jul 2025). Experienced maintainers were ~19% slower with AI on familiar codebases despite predicting a ~24% speedup.
3. Peng et al., Management Science (INFORMS, 2025). The Impact of AI on Developer Productivity: Evidence from GitHub Copilot. Copilot users completed a standardized task ~56% faster; least-experienced developers gained most.