The Automation Tax: When AI Code Saves Time and Costs More

In economics, a tax is straightforward: you earn something, then give part of it back. AI-assisted coding has introduced a new version of this. You save an hour writing code, then spend two reviewing, debugging, and cleaning up after it. Call it the automation tax.

The premise of AI coding tools is simple. Generate code faster, ship more features, free developers for higher-order work. And the generation part works. Code appears on screen at extraordinary speed. But what happens after that code exists is where the bill comes due.

The Invoice

The numbers are accumulating from multiple directions.

CodeRabbit, an AI code review platform that analyzed over 5 million pull requests, found that AI-assisted PRs trigger 1.7x more review issues than human-written ones. The code arrives faster but carries more defects per line.

Cortex, tracking engineering metrics across hundreds of teams, reported that incidents per pull request rose 23.5% as AI adoption increased through 2025. More code, more incidents. The production environment doesn't grade on a curve.

Google's 2025 DORA report quantified the stability cost: for every 25% increase in AI-generated code within a team, deployment stability drops 7.2%. The relationship is linear and negative.
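
Taken at face value, the DORA figure implies a simple linear trade. Here is a minimal sketch of that model; extrapolating the cross-team relationship within a single team is an assumption of mine, not something the report promises:

```python
# Naive linear model of the DORA finding: every 25-point rise in the share
# of AI-generated code costs about 7.2 points of deployment stability.
# Illustration only; the relationship is measured across teams, and
# extrapolating it within one team is an assumption.

STABILITY_COST_PER_POINT = 7.2 / 25  # stability points lost per point of AI share

def expected_stability_drop(ai_share_increase: float) -> float:
    """Projected deployment-stability loss for a given rise in AI code share."""
    return ai_share_increase * STABILITY_COST_PER_POINT

print(expected_stability_drop(25))  # 7.2, the headline figure
print(expected_stability_drop(50))  # 14.4, if the linearity really holds
```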

Veracode's analysis of AI-generated code found security vulnerabilities in 45% of the samples it tested. Not edge cases or theoretical risks. Injection flaws, authentication bypasses, hardcoded credentials. The kind of issues that make security teams reach for the incident response playbook.

GitClear, tracking code quality metrics across thousands of repositories, documented a sharp rise in code churn: code that gets written and then rewritten shortly afterward. The pattern is consistent with rapid generation followed by manual correction.

And then there's the perception gap. A METR randomized controlled trial gave experienced open-source developers real tasks, randomly assigning AI tool access. Developers with AI were 19% slower. They believed they were 20% faster. The confidence moved in the opposite direction from the performance.

The Production Test

If these were only metrics on dashboards, they might stay academic. They didn't.

In late 2025 and early 2026, Amazon experienced a series of AI-related production incidents. An AI coding agent deleted production code. An AI developer tool triggered a major service outage. By March 2026, a cascade of failures in retail systems led to an estimated 6.3 million lost orders. Amazon's response was institutional: a 90-day safety reset, mandatory senior engineer sign-off for AI-generated deployments, and required deep-dive reviews for every AI-related incident.

Elsewhere, Replit's AI assistant deleted a user's production database during a routine operation. The Lovable AI platform shipped code with an authentication bypass severe enough to earn a CVE. A dating app built with AI tools exposed 72,000 private user images through elementary backend flaws.

These aren't stories about AI failing to generate code. The code was generated fine. The failure was downstream: in review, in testing, in the assumption that speed of production implies correctness of output.

The Downstream Shift

In an earlier post on Amdahl's Law, I formalized how rework eats into AI productivity gains. The core finding: at the empirically measured rework rate of 37%, even perfect automation of every task caps total speedup at 2.7x. The more you automate, the more vulnerable you become to correction costs.
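
The arithmetic behind that ceiling fits in a few lines. A minimal sketch, treating the rework fraction as the serial portion in Amdahl's Law, which is the framing from the earlier post:

```python
import math

# Amdahl's Law with rework as the serial fraction: if a share r of total
# work is correction that stays manual, even infinitely fast automation
# of the remaining (1 - r) still leaves r of the original time.

def overall_speedup(rework_fraction: float, automation_speedup: float = math.inf) -> float:
    """Total speedup when the non-rework portion runs automation_speedup times faster."""
    automated = 1 - rework_fraction
    return 1 / (rework_fraction + automated / automation_speedup)

print(overall_speedup(0.37))       # ~2.70x: the ceiling cited above, i.e. 1 / 0.37
print(overall_speedup(0.37, 3.0))  # ~1.72x with an assumed, more modest 3x on automated work
```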

What the coding data adds is specificity about where the tax gets paid. AI accelerates generation (maybe 20% of a developer's actual workday). But review time increases. Incident response increases. Security remediation increases. The work doesn't disappear. It shifts downstream, from the person writing code to everyone who has to live with it: reviewers, ops teams, security engineers, end users.
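
A back-of-envelope model makes the shift concrete. The 20% generation share comes from the paragraph above; the review share, the generation speedup, and the reuse of the 1.7x issue rate as a review-time multiplier are all illustrative assumptions, not measurements:

```python
# Back-of-envelope: net workday change when AI halves generation time but
# inflates review. Every figure below except the 20% generation share is
# an assumption for illustration.

GENERATION_SHARE = 0.20  # share of the workday spent writing code (post's estimate)
REVIEW_SHARE = 0.25      # assumed share of the workday spent on review
OTHER_SHARE = 1 - GENERATION_SHARE - REVIEW_SHARE

GEN_SPEEDUP = 2.0        # assume AI makes generation twice as fast
REVIEW_INFLATION = 1.7   # assume review time scales with the 1.7x issue rate

new_day = (GENERATION_SHARE / GEN_SPEEDUP
           + REVIEW_SHARE * REVIEW_INFLATION
           + OTHER_SHARE)

print(f"{new_day:.3f}")  # 1.075: the day gets 7.5% longer despite faster generation
```

Under those assumptions the net effect is a 7.5% slowdown, the same direction as the METR result, even though generation itself got twice as fast.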

Open-source maintainers have seen this first. Daniel Stenberg, creator of curl, now rejects AI-generated contributions outright, calling AI bug reports "mass-generated waste." Ghostty, Gentoo, and NetBSD have imposed similar restrictions. These projects lack the staffing to absorb the review burden that AI-generated contributions impose. The automation tax falls hardest on those with the least capacity to pay it.

The Bill Comes Due

The pattern across all these data points is consistent. AI coding tools move work from creation to verification. They shift costs from the developer who writes the code to everyone downstream: the reviewer who catches the bug, the ops team who handles the incident, the security engineer who patches the vulnerability, the maintainer who triages the pull request.

This isn't an argument that AI coding tools are useless. It's an observation that their costs are systematically underpriced because they show up on someone else's balance sheet. The developer sees faster output. The organization sees more incidents. The gap between those two experiences is the automation tax.

And as with most taxes, nobody enjoys paying it. But unlike most taxes, this one is optional. The organizations navigating it well (Amazon's safety reset being the most visible example) are the ones that recognized the tax exists and built the review infrastructure to pay it deliberately rather than in production.


Links: METR Developer Productivity Study (METR) | Cortex Engineering Metrics 2026 (Cortex) | CodeRabbit Code Quality Report (CodeRabbit) | CNBC: Amazon AI Outages (CNBC) | DORA Report 2025 (Google DORA) | Veracode State of Software Security (Veracode) | The 20x Ceiling: Amdahl's Law (Just a Tourist)

#ai #automation #code-quality #productivity #software-engineering