Non-coders succeed at coding tasks at nearly the same rate as software engineers, on average, when using Claude Code. That finding comes from Anthropic’s research summary, which draws on approximately 400,000 sessions across roughly 235,000 unique individuals between October 2025 and April 2026.
Seven months of data. It’s the first large-scale empirical look at what actually happens when a diverse population, not just developers, uses an autonomous coding agent.
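As a rough back-of-the-envelope check on those figures, the short Python sketch below divides the reported session count by the reported number of individuals and counts the months in the study window. Both inputs are the approximations quoted above, so the result is only indicative, and the variable names are illustrative.

```python
# Figures as reported in the article; both are approximate.
sessions = 400_000
individuals = 235_000

# Average sessions per individual over the study window.
sessions_per_individual = sessions / individuals

# Inclusive month count from October 2025 through April 2026.
start_year, start_month = 2025, 10
end_year, end_month = 2026, 4
months = (end_year - start_year) * 12 + (end_month - start_month) + 1

print(f"{sessions_per_individual:.2f} sessions per individual over {months} months")
```

On these numbers, the average works out to fewer than two sessions per individual.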
The expertise story is more nuanced than it first appears. The study confirms that greater domain expertise correlates with more work done by Claude per instruction and with higher session success rates. According to the paper’s full dataset analysis, expert users achieve approximately a 91% task success rate compared to approximately 15% for novices, though the paper explicitly notes that the gap between intermediate and expert users is modest. Don’t read that as a simple binary. The gains from expertise are real, but they’re concentrated at the extremes.
The aggregate pattern across the seven months is clear. Usage shifted toward more end-to-end agentic execution (deploying and running code, analyzing data, writing non-code documents) and away from debugging. The share of sessions spent debugging fell by nearly half. The paper also reports that the estimated economic value of typical tasks rose over the study period; the full paper cites approximately 25%, a figure that could not be confirmed from the published research page.
Who This Affects
What this means for engineering teams
The non-coder parity finding is the one that will generate the most discussion, and it deserves precision: “nearly the same rate as software engineers, on average” applies to average tasks. Not advanced tasks. Not tasks requiring architectural judgment or system design. The word “average” is load-bearing here. It means routine, well-scoped work, the kind that fills a significant portion of most engineers’ days but doesn’t define their career value.
The expertise multiplier finding points in the opposite direction. The more domain knowledge someone brings to a Claude Code session, the more work Claude does per instruction. Experts aren’t being replaced. They’re being amplified, getting more output from the same input. That’s a different story than displacement. It’s a productivity story, and the implications for how engineering teams should be structured are not yet settled.
This study is the first major output of Anthropic’s $200M commitment to studying AI’s economic effects, announced in June. The research covers Claude Code specifically; its findings shouldn’t be generalized to all agentic coding platforms without comparable empirical data from those tools.
Analysis
The gap between intermediate and expert users is modest, per the source. The gap between novices and experts is large. That combination means the biggest productivity gains from Claude Code go to people who bring substantial domain knowledge: the tool amplifies expertise more than it substitutes for it. That’s the finding that should anchor workforce planning conversations.
What to watch
Three forward signals are worth tracking. First, whether other labs publish comparable empirical studies on their coding agents. A single study from the developer of the tool being studied is a starting point, not a conclusion. Second, how enterprise HR and compliance teams use this data as they start evaluating workforce change obligations. The study documents capability shifts, not headcount reductions, but that distinction gets lost in summary form. Third, the seven-month trend line on debugging time and agentic use, which is moving fast. Where it sits at twelve months will be more telling than where it sits today.
TJS synthesis
Anthropic’s study doesn’t tell you whether AI is replacing software engineers. It tells you that domain expertise remains the multiplier, that non-coders can now reach average-task parity, and that the nature of work is shifting toward agentic execution and away from debugging. For engineering managers, that’s a staffing and training question. For compliance teams, it’s a workforce change assessment question. Wait for comparable studies from other platforms before treating these numbers as industry-wide fact.