Contra Matt Shumer on AI

Disclaimer: These are my views, not my employer’s; I have no relevant non-public information.

Last week, Matt Shumer published ‘Something Big Is Happening’, in which he reviewed recent progress in AI, forecast future developments, and warned industry outsiders about the potential impact (especially on their careers). The post garnered a lot of attention across X, LinkedIn, and (for some reason) Fortune; Axios described it as “mega-viral”.

Matt makes several high-level points I agree with.

He also makes some dramatic predictions for the near future using reasoning I find specious. Specifically, he extrapolates from recent AI progress in software engineering to forecast growth in general model capability, in a way that, in my opinion, disregards the mechanics of how those models are developed. If this is an intentional oversimplification meant to land his message of economic impact with a non-expert reader, I believe it goes too far.

Matt begins with a thematic callback to the pre-calamity days of February 2020 before explaining his motivation:

Here’s the thing nobody outside of tech quite understands yet: the reason so many people in the industry are sounding the alarm right now is because this already happened to us. We’re not making predictions. We’re telling you what already occurred in our own jobs, and warning you that you’re next.

After recounting the rapid evolution of AI coding tools over the last three years and highlighting the impressiveness of the latest models, Matt explains why these tools are a harbinger of wider future disruption:

The AI labs made a deliberate choice. They focused on making AI great at writing code first… because building AI requires a lot of code. If AI can write that code, it can help build the next version of itself. A smarter version, which writes better code, which builds an even smarter version. Making AI great at coding was the strategy that unlocks everything else. That’s why they did it first. My job started changing before yours not because they were targeting software engineers… it was just a side effect of where they chose to aim first.

I would say this is so incomplete as to be false. Programming is extremely well-suited as a target capability for large language models for a number of other reasons, chief among them:

- An enormous corpus of public code exists to draw on during pre-training.
- Programming tasks can be scored automatically (does it compile? do the tests pass?), which makes them natural targets for reinforcement learning.

Eliding the critical roles of pre-training (and therefore training data) and reinforcement learning (and therefore an environment that facilitates scoreable tasks) in developing model capability gives an uninformed reader the false impression that any discipline could be targeted just as easily.
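To make the idea of ‘scoreable tasks’ concrete, here is a minimal sketch (my own illustration, not anyone’s actual training pipeline) of why code is such a convenient reinforcement learning target: a candidate solution can be scored automatically, end to end, by running it against tests.

```python
import subprocess
import sys
import tempfile

def score_candidate(solution_code: str, test_code: str) -> float:
    """Reward for a coding task: 1.0 if the generated solution passes the
    tests, 0.0 otherwise. No human judgment anywhere in the loop."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(solution_code + "\n" + test_code)
        path = f.name
    try:
        result = subprocess.run([sys.executable, path],
                                capture_output=True, timeout=10)
        return 1.0 if result.returncode == 0 else 0.0
    except subprocess.TimeoutExpired:
        return 0.0

# A (hypothetical) model-generated solution and the tests that score it:
candidate = "def add(a, b):\n    return a + b\n"
tests = "assert add(2, 3) == 5\n"
print(score_candidate(candidate, tests))  # 1.0, directly usable as an RL reward
```

There is no equivalent one-line check for, say, a legal brief or a product roadmap; for most disciplines, constructing the scoring environment is itself the hard part.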

Furthermore, presenting recursive self-improvement (where a model at first aids in and later autonomously manages the creation of a model more capable than itself) as something simply ‘unlocked’ by model coding ability is a ridiculous overstatement.

Developing a large language model is a fundamentally different task from writing a computer program, even a complicated one like a compiler or operating system. While powerful new programming tools will certainly aid the software engineers who build these models, it’s not clear to me how this aid would be a compounding factor rather than a linear productivity boost.

Matt reiterates this point in a later section:

On February 5th, OpenAI released GPT-5.3 Codex. In the technical documentation, they included this:

GPT-5.3-Codex is our first model that was instrumental in creating itself. The Codex team used early versions to debug its own training, manage its own deployment, and diagnose test results and evaluations.

Read that again. The AI helped build itself.

This isn’t a prediction about what might happen someday. This is OpenAI telling you, right now, that the AI they just released was used to create itself. One of the main things that makes AI better is intelligence applied to AI development. And AI is now intelligent enough to meaningfully contribute to its own improvement.

To me, the tasks that OpenAI mentions are familiar capabilities of existing models: they can write excellent code, interact with command-line interfaces, and analyze moderately sized datasets. Further investigation of the press release (‘How we used Codex to train and deploy GPT‑5.3‑Codex’) corroborates this assessment: all the examples provided are of humans using a tool on an isolated task to accelerate their work.

Again, the tools that exist today are powerful! For software engineering they are transformative! But everything in that announcement seems to fall under the banner of linear rather than compounding improvement. While recursive self-improvement is certainly part of the possible outcome distribution, I don’t think the available evidence justifies framing it as ‘events already in motion’.
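The distinction is worth making concrete. As a toy illustration with entirely invented numbers, compare a tool that gives each successive generation of engineers the same fixed productivity boost against a genuine feedback loop in which each generation’s gains scale the next:

```python
# Toy comparison: linear productivity boost vs. compounding self-improvement.
# The numbers are invented for illustration; nothing here models real labs.

generations = 10
linear = [1.0]       # each generation gets the same fixed +0.3x from better tools
compounding = [1.0]  # each generation's boost scales with the previous capability

for _ in range(generations):
    linear.append(linear[-1] + 0.3)
    compounding.append(compounding[-1] * 1.3)

for i, (a, b) in enumerate(zip(linear, compounding)):
    print(f"gen {i}: linear {a:.2f}x  vs  compounding {b:.2f}x")
```

The curves separate slowly at first and then dramatically, which is exactly why ‘the model helped debug its own training’ is evidence for the first curve but not, on its own, for the second.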

However, it is exactly that framing that contextualizes Matt’s boldest predictions:

[Anthropic CEO] Dario Amodei, who is probably the most safety-focused CEO in the AI industry, has publicly predicted that AI will eliminate 50% of entry-level white-collar jobs within one to five years. And many people in the industry think he’s being conservative. Given what the latest models can do, the capability for massive disruption could be here by the end of this year. It’ll take some time to ripple through the economy, but the underlying ability is arriving now.

This is different from every previous wave of automation, and I need you to understand why. AI isn’t replacing one specific skill. It’s a general substitute for cognitive work. It gets better at everything simultaneously.

I think this is jumping the gun and that the “general substitute for cognitive work” threshold has not been crossed. It might happen in the next few years, or it might not.

To a layman, I would explain the recent history of AI progress like this: the largest development of the last decade was the discovery that training a model on massive amounts of data created text-generation capabilities that exceeded the performance of models that were trained on less data but refined for a particular task (translation, summarization, etc.).

One model trained on a lot of data could be better at all of these tasks than each individual, fine-tuned model. This was a bit of a magical discovery.

Currently, the most promising avenue for making a similar jump in general capability is scaling up the reinforcement learning process. Can you achieve capability across arbitrary tasks (like a human’s general cognitive and functional ability) by using a large and diverse set of reinforcement learning scenarios?
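Concretely (and again, this is my own sketch, not a description of any lab’s pipeline; every name below is hypothetical), ‘a large and diverse set of reinforcement learning scenarios’ means something like a registry of environments, each able to pose a task and automatically score an attempt.

```python
import random
from dataclasses import dataclass
from typing import Callable

@dataclass
class Environment:
    """One RL scenario: it can pose a task and automatically score an attempt."""
    name: str
    make_task: Callable[[], str]
    score: Callable[[str, str], float]  # (task, attempt) -> reward in [0, 1]

def make_arithmetic_task() -> str:
    a, b = random.randint(1, 99), random.randint(1, 99)
    return f"{a} + {b} = ?"

def score_arithmetic(task: str, attempt: str) -> float:
    a, b = (int(x) for x in task.removesuffix(" = ?").split(" + "))
    return 1.0 if attempt.strip() == str(a + b) else 0.0

environments = [
    Environment("arithmetic", make_arithmetic_task, score_arithmetic),
    # Environment("coding", ...), Environment("web research", ...), ...
    # Each additional scenario needs its own reliable scoring function.
]

def training_step(model_answer: Callable[[str], str]) -> float:
    env = random.choice(environments)           # sample a scenario
    task = env.make_task()                      # pose the task
    return env.score(task, model_answer(task))  # automatic reward for the RL update

if __name__ == "__main__":
    # A stand-in that happens to solve arithmetic; not a model.
    stub = lambda task: str(sum(int(x) for x in task.removesuffix(" = ?").split(" + ")))
    print(training_step(stub))  # 1.0
```

The hard part is hidden in the commented-out lines: every new scenario needs a task generator and, crucially, a scorer that cannot be gamed. Whether such a registry can be made broad and reliable enough to approximate ‘arbitrary tasks’ is the open question.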

If so, then “massive disruption” of existing economies is likely. This is staggering to consider in detail, though even in this scenario my predictions are more moderate than what Matt describes. For an overview of some possible mitigating factors I recommend this article from Epoch AI.

If not, then the economic impacts will be more gradual and more piecemeal: the work in individual industries will evolve significantly but without a dramatic upheaval outside of specific niches like translation and illustration that are more easily commoditized.

If anything, I think Matt’s piece understates the significance of already-available capabilities. Even if another step-change in AI capability does not materialize, most of his recommendations are still relevant.

In my view there is significant practical value to be gained simply from optimizing existing models for new use cases, creating structures for orchestrating agents, and packaging this functionality so that it can be applied by domain experts in diverse industries.
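To give one concrete shape to ‘creating structures for orchestrating agents’ (a hypothetical sketch with made-up names, not any particular framework’s API): even a thin layer that routes a domain expert’s request through a draft step and a review step is the kind of structure that can be packaged once and reused across an industry.

```python
from typing import Callable

def orchestrate(request: str, complete: Callable[[str], str]) -> str:
    """A minimal two-step agent structure: draft, then self-review.

    `complete` stands in for any chat-completion call; this is an
    illustrative sketch of orchestration, not a product design.
    """
    draft = complete(
        f"You are a domain specialist. Produce a first draft for:\n{request}"
    )
    return complete(
        "Review the draft below for factual and logical problems, then "
        f"return an improved version.\n\nRequest: {request}\n\nDraft:\n{draft}"
    )

if __name__ == "__main__":
    # A stub model so the sketch runs on its own.
    stub = lambda prompt: f"[model output for: {prompt[:40]}...]"
    print(orchestrate("Summarize this quarter's compliance changes", stub))
```

The value for a domain expert is that the prompts, the sequencing, and the review criteria are captured in the structure rather than re-invented for every request.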

Until there is clear evidence of progress in generalized model capability, this is the area where I would recommend focusing.