Lots of people—including lots of influential people working on AI—believe that AI’s economic impact will primarily come from the automation of R&D. However, Epoch AI researchers Ege Erdil and Matthew Barnett disagree.
Context
In their post “Most AI value will come from broad automation, not from R&D”, Erdil and Barnett argue that—spoiler in the title—general labor automation, not R&D specifically, will define AI’s economic impact.
To start, they point to a broad fact: R&D expenditures account for only about 20% of U.S. labor productivity growth since 1988; better management, knowledge diffusion, and the like account for the rest. It follows that if R&D isn’t even the main driver of growth, then its automation won’t be as economically impactful as some claim. My friend (rightly) argues here that TFP growth is actually the correct thing to look at, and R&D accounts for about half of that, not just 20%. But regardless, this isn’t the article’s point.
What’s important are the two reasons why they think labor automation—not R&D automation—is going to be more economically impactful.
First, they think that a software-only intelligence explosion or singularity is unlikely—an interesting take, given that many short-timeline forecasts hinge on such an explosion occurring. They cite Epoch AI’s previous work analyzing domain-relevant R&D returns to evaluate the plausibility of a software singularity, the conclusion of which is basically “ehh, maybe, but maybe not?”.
They point out that this analysis assumes that researcher effort is the only input to R&D. However, if progress requires both researcher effort and data, and those two inputs are sufficiently complementary (elasticity of substitution < 1), then AI R&D automation might be significantly bottlenecked by hardware constraints. For reference, Oberfield and Raval (2014) estimate that the elasticity of substitution between labor and capital in the U.S. manufacturing sector is 0.7. An elasticity that low would be enough to snuff out a “software-only singularity” after “less than an order of magnitude of improvement in efficiency.”
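To see why that complementarity matters so much, here’s a minimal numerical sketch. It is my own illustration, not the authors’ or Epoch’s actual model, and the parameter values (equal input shares, an elasticity of 0.7) are purely illustrative:

```python
# A minimal sketch (my own illustration, not the authors' model): a CES
# aggregate of "software" and "hardware" inputs with elasticity of
# substitution sigma < 1, i.e. the two inputs are complements.

def ces_output(software, hardware, sigma=0.7, share=0.5):
    """Y = (share * S^rho + (1 - share) * H^rho)^(1/rho), rho = (sigma - 1) / sigma."""
    rho = (sigma - 1) / sigma
    return (share * software**rho + (1 - share) * hardware**rho) ** (1 / rho)

baseline = ces_output(1.0, 1.0)
for boost in [10, 100, 1_000, 1_000_000]:
    gain = ces_output(boost, 1.0) / baseline
    print(f"{boost:>9,}x software -> {gain:.2f}x output")

# With sigma = 0.7 and hardware held fixed, gains cap out around
# (1 - share)^(1/rho) ≈ 5x, and most of that ceiling is reached within an
# order of magnitude or two of software improvement.
```

Whether the true elasticity between researcher effort and experiment compute is anywhere near 0.7 is exactly what’s contested; the sketch just shows why that number carries so much weight.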
Second, the authors argue that while current models are already useful for the abstract-reasoning bits of scientific R&D, a substantial portion of R&D tasks require physical manipulation and advanced agency—skills that won’t be automatable for a while. Basically, by the time AI improves enough to perform these functions well, R&D automation will just be a page-11 side note in a newspaper whose headline reads “AI Takes Over Labor Force: Public Panics!”
Summed up, these researchers want us to imagine AI progress in coming years not as datacenters full of super-geniuses launching us into Skynet before we can say “software singularity” five times fast, but instead as incremental, compute-driven improvements that spur automation of more and more labor tasks, accelerating economic growth.
My View
Overall, I am pretty sympathetic to this take. I wholeheartedly agree on the difficulty of full-stack automation of science R&D, and I am similarly uncertain about the plausibility of a software-driven intelligence explosion.
However, it’s unclear to me what specific timeline is implied by their predictions for “coming years.” Do they anticipate that a significant portion of labor will be automated in a few years? A decade? More?
I recognize that this is outside the scope of their research, but I think pinning down the rate at which labor automation might occur is worth doing more work on. A lot more work.
Coming from an AI governance perspective, policy is incredibly time-sensitive, so probing into something more informative than “within this century, probably” would be helpful, in my humble opinion. I may or may not be slightly miffed about someone at an unnamed AI governance org advising me to steer my research away from voluntary standards because he has crazy short timelines. That topic, it turns out, is pretty relevant to the “private governance” ideas that have been floating around.
Anyways. The point is that I do appreciate work like this for trying to map out how AI might eventually transform the economy. Still, waving your arms at politicians and yelling that something big is going to happen sometime soonish—without particular clarity as to when and how—has yielded little practical success so far.
Some of this can be attributed to malign political incentives, but a good part of it is justified. In a democracy, directing the unwieldy instruments of law and bureaucracy requires overcoming significant inertia, so it’s important not to wake the beast until you know what you’re going to do and are ready to actually do it.
Part of this preparedness is understanding not just the big picture importance of labor automation, but also modeling when and how it might happen—details that are essential to informing timely, concrete policy action.
Breaking Down the Model
One way we can uncover these details is by breaking the researchers’ model of labor automation, as presented in their post, into its constituent parts. For that purpose, I’ve cobbled together a simple visual representation of (my understanding of) their model.
Basically, they think AI-driven economic growth will look like this:
Rather than this:
That’s all well and good. However, once we lay it out like this, we can see that their main claim—no software singularity—doesn’t rule out labor automation ending up looking a lot like the second scenario anyway. And of course, nothing about it tells us how slow or fast any of this will happen. Below, I’ll examine each link and show how different progress dynamics could stretch or compress automation timelines, pointing out areas for further research.
Current Models → Labor-Automating Agents
The first thing that strikes me is that, absent AI R&D automation and an intelligence explosion, I’m not sure how quickly scaling will bring us labor-automating agents.
The jagged frontier is real—I think there are some major deficits that make current models utterly unsuited to replace human employees, even if they’re high-performing in other areas. Some of these issues seem non-trivial and architecturally rooted (e.g. lack of dynamic memory and limited context windows), so it’s not obvious they’ll just go away with more scaling, and it’s unclear how close or far we are from solving them with scaffolding or other techniques.
Again, in a political context, “close” and “far” can be relatively compressed. A few months vs. a few years makes a significant difference in practice.
Current Models → AI R&D Agents
From this, another implication may follow. If compute alone is sufficiently slow at bringing about labor-automating agents, then—even assuming diminishing returns—automating AI R&D might still be a better vector for advancing labor automation than simply scaling up.
Sure, if the relevant elasticity really is around 0.7, for example, then we’ll only see an order of magnitude of improvement before kaput. But still, it isn’t crazy to think that creating millions of sleepless 10x AI researchers first is going to be a faster path to labor automation than just letting Sam Altman go “uhhhh have we tried throwing more compute at it???”.
If that is the better vector, then labor automation timelines depend on AI R&D automation timelines. To me, making AI R&D agents seems easier than making labor-automating agents—software experiments are much easier for models to validate and receive feedback on compared to general corporate activities.
However, the jagged frontier might remain problematic, and therefore, labor automation timelines might still depend on how soon these issues can be solved.
If we expect this to happen soon, and if we expect “merely 10x” agents to still be pretty cracked, then widespread labor automation might happen soon and fast. On the other hand, if we expect that the roadblocks to creating AI R&D agents will turn out to be pretty hard to solve, and if we expect that the resulting agents will be unimpressive anyways, then labor automation timelines might get stretched out over decades. And of course, it could be a mix of these two scenarios.
AI R&D Agents → Labor-Automating Agents
Earlier, I stated that if we get good AI R&D agents quickly, then we’ll get labor automation quickly. However, that’s not necessarily true.
The authors highlight the importance of the complementarity between data and researcher effort. In the context of their argument, this basically means that automating AI R&D isn’t just about running a lot of AI R&D agents—it’s about running a lot of (potentially costly & time-consuming) software experiments to gather data, which introduces another compute constraint. This is a pretty contentious point in arguments over software take-off.
But here, my claim is about data on labor tasks (rather than software experiments), with the constraint being capital-intensive data collection as opposed to compute. LLMs are trained on roughly the entire internet corpus, which is great for learning how to produce smart text outputs, but maybe not-so-great for learning how to efficiently navigate a desktop UI.
Perhaps AI R&D agents will be very sample efficient, but perhaps this sample efficiency won’t generalize to tasks outside AI R&D itself—kinda like how a professional dancer can perfectly replicate a dance move after one viewing, but wouldn’t be able to do the same with a violin piece.
If that’s the case, then perhaps AI developers would still need to collect a lot of novel data, for example, by paying workers for recordings of them solving long-horizon tasks, as suggested in AI 2027.
But this sort of data collection seems like it could be quite time-intensive and costly, potentially dampening capabilities progress. If that’s the case, then even if we get AI R&D agents soon, that doesn’t get us labor-automating agents soon after.
Labor-Automating Agents → Economic Growth
Finally, we can consider the link between labor automation and economic growth. In the absence of an intelligence explosion, the authors model labor-automating agents as having an incrementally expanding set of labor tasks they are capable of doing, with deployment across the economy growing as this set expands.
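To make that picture concrete, here’s a toy sketch of how I read it (my own illustration, not their formal model, and every parameter is made up): the automatable task share expands gradually, tasks are treated as strong complements, and the not-yet-automated remainder caps the gains, Amdahl’s-law style.

```python
# Toy sketch (my own illustration): the automatable task share expands over
# time; tasks are perfect complements, and automated tasks get a fixed speedup.

def productivity(automated_share, ai_speedup=10.0):
    """Output per worker when every task must still get done: automated tasks
    run ai_speedup times faster, the rest proceed at human speed."""
    human_share = 1.0 - automated_share
    return 1.0 / (human_share + automated_share / ai_speedup)

for share in [0.05, 0.15, 0.30, 0.60, 0.90, 0.99]:
    print(f"{share:.0%} of tasks automated -> {productivity(share):.2f}x productivity")
```

Under assumptions like these, gains stay modest until coverage gets very high, which is one way to cash out the “incremental, compute-driven” story.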
However, it seems strange to me to treat labor tasks as one homogeneous mass. The types of tasks that become automatable, and how those tasks are allocated across the labor force, seem to matter a lot for predicting the path of automation and economic growth.
For example, compare (a) automating 5% of tasks across 80% of jobs, versus (b) automating 100% of tasks across 4% of jobs. In both cases, you’re affecting the same percentage of total tasks (4%), but the effects of each could look very different, especially depending on what those 4% of jobs are in (b).
In (a), where productivity improvement is diffuse, it seems like those gains could be easily offset by behavioral compensation from employees (e.g. “Well since the agent automates this bit of my work, I can slow down on everything else.”), or stickiness in firm behavior (e.g. being insensitive to low levels of gain). This would not only dampen overall growth, but could also reduce investment in AI if adopting companies see disappointing results.
Another related dynamic here is if human supervision continues to be necessary even as models become more capable. Models already perform better and faster on certain white collar tasks, but they tend to lose the plot during long-horizon workflows, getting stuck in silly action loops or catastrophically forgetting obvious information.
Although human input in this situation would be highly complementary and therefore valuable, I’d imagine a lot of people would be very scared witnessing their jobs getting gradually degraded down to LLM babysitter, without any guarantee that they’ll be needed at all in the future.
This could cause sabotaging behaviors to emerge, which would dampen growth and investment and stretch out automation timelines. People are already getting antsy about AI, and “quiet quitting” or whatever is already a thing, so I wouldn’t be surprised by this outcome.
On the other hand, if in (b) those 4% of jobs happen to be concentrated in a vertically integrated slice of a particular industry, this could give rise to AI-driven natural monopolies that produce rapid, unprecedented market concentration, as well as spillover effects in related industries.
This would not only boost growth but also amplify excitement about AI, driving further investment in and improvement of labor-automating agents. Plus, wholesale replacement means you won’t need to worry about sabotage or other human resistance, so automation timelines accelerate even further.
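Here’s a toy version of that contrast. Every number is made up for illustration: the 50% “capture rate” in (a) stands in for behavioral compensation and firm stickiness, and the 1.5x spillover factor in (b) stands in for concentration and spillover effects. The only point is that the same 4% of total tasks can cash out very differently.

```python
# Toy comparison of scenarios (a) and (b). All parameters are hypothetical.

def realized_gain(task_share_per_job, job_share, capture_rate, spillover=1.0):
    """Share of total tasks automated, discounted by how much of the gain is
    actually captured and scaled by any spillover/concentration effects."""
    return task_share_per_job * job_share * capture_rate * spillover

a = realized_gain(0.05, 0.80, capture_rate=0.5)                  # diffuse, partly offset
b = realized_gain(1.00, 0.04, capture_rate=1.0, spillover=1.5)   # concentrated
print(f"(a) diffuse:      ~{a:.1%} effective impact (in task-share terms)")
print(f"(b) concentrated: ~{b:.1%} effective impact (in task-share terms)")
```

Again, purely illustrative, but it shows how much the distributional details matter.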
Conclusion
When you break this model of labor automation down into its constituent links, there turn out to be many unanswered questions, and different best guesses at the answers could squeeze timelines into two years or stretch them out over two decades (and that’s just predicting the start, not to mention the rate).1 Even if you don’t buy the intelligence explosion, it remains unclear whether scaling or AI R&D automation will mediate labor automation. This ambiguity is hard to act on usefully, policy-wise, and while I recognize that we’re never going to be able to perfectly predict the world, I think we’re still clearly clueless enough that this is worth doing more work on.
Or maybe I just haven’t personally seen attempts to answer these questions. Please link relevant work if that’s the case!