More anecdata + agreement wrt robustness for careers:
The people I know who are making big career choices in the shadow of transformative AI seem to be making choices pretty robust to uncertainty. It's more "I had an AWS SWE offer right out of undergrad, but I decided to move to the Bay and do a mechinterp startup instead," than "I'm telling everyone I love to FUCK OFF and joining a cult."
Suppose it's 2040, and Nothing Ever Happened. The person who turned down the AWS offer because she wrongly believed Something Would Happen now has no retirement account and no job (the mechinterp startup is dead; there's nothing to interpret). Where does that leave her? In the same boat as 4 in 10 Americans[1], and probably with at least one economically-useful skill. (Startups will probably still exist in 2040.) She's certainly counterfactually worse off than if she'd gone to Seattle instead of the Bay... but not so much worse off that I'd be comfortable calling her plans "not robust to uncertainty."
If the pivoters I know are wrong, they don't enter Yann LeCun-world with a couple embarrassing-in-hindsight Substack comments and zero career capital — they enter with a couple embarrassing-in-hindsight Substack comments and pretty-good career capital. Maybe titotal and I (and you) just disagree about what "robust" means?
[1] https://news.gallup.com/poll/691202/percentage-americans-retirement-savings-account.aspx
Also, the diamondoid bacteria link is broken :(
+1 on all that, & yep maybe the meaning of "robust" here needs more clarification. Link should be fixed now!
> A few years of reduced savings or delayed family planning seems like a fair hedge against the real possibility of transformative AI this century.
I think my issue with this is the assumption that we'll actually know the state of play in a few years. Obviously if AGI happens and we hit superhuman coders in 2027, we'll know by 2027. But if we don't, there's no reason to assume that we won't be having the exact same debate in 2027. It might get even worse as we get closer to AGI; for example, we can imagine a timeline in which ChatGPT almost reaches AGI in 2027 and causes significant job losses (or at least shifts in the job market) via increased automation and productivity, but doesn't manage to fully replace humans.
In that timeline, I would expect AI Doomers to be even more frantic about basing their life decisions on short timelines. But as you point out, the question is transformative AI this *century*. In that case, I think that the "just a few more years" framework is pretty problematic. At some point, you have to make an arbitrary decision to just keep living your life, and I think that will only get harder as we get "closer" to AGI.
Otherwise, great post! I particularly like the framing of inaction as a bet, too: being the default makes it no less of a choice than action.
> Here, it might be helpful to sort the people who update (i.e. make serious life decisions) based on AI 2027 & co. into two categories: people involved in AI safety and/or governance, and everyone else.
I think this is on the right track, but carves the joints a bit wrong. The categories that matter here are:
- people involved in AI capabilities work, who should be acting so as to avoid killing everyone
- people involved in AI safety, who should be acting so as to be maximally persuasive to the prior group
- everybody else as individuals, who should probably hedge against mass unemployment to the extent they can but otherwise not worry about this stuff unless it entertains them
- everybody else as a mass, who should get really really angry at nothing in particular in the hopes it slows the badness down
(I, of course, am failing by all these standards; the fault was never in our stars)
Of course these are not hard distinctions - my point is just that the assumption that AI safety people having true beliefs matters *by default*, and not as a result of some exercise of power on their part, is an extremely dangerous one. If you really take Yudkowsky-style catastrophism seriously - and as that framing suggests I largely don't, although I think "disempowerment" is more likely than not - then we're in all-or-nothing territory, right now, and nothing else matters. Whether p(doom) is 10% or 50% is irrelevant, because both imply the exact same course of action - you should be pushing whichever number is most persuasive in whatever your critical window is, even if that massively raises the risk you embarrass yourself afterwards.
To your final point, I actually think better capabilities modeling/forecasting ("improving epistemics") is itself extremely useful work, and I was sad to see AI 2027 be the most thoroughgoing and energetically-promoted project of its kind.
Political will/buy-in is maybe the biggest bottleneck to implementing existing policy ideas, many of which have already been implemented in other places. In the UK, the CAIS letter was the main catalyst. Well-respected models can play the same role, but I think the software-only assumptions, over-reliance on naive extrapolation, and dramatic narrative packaging of AI 2027 made it a poor candidate for this.
To clarify, are you saying that the current tradeoff between verifying short timelines (via high-quality models) & acting on short timelines is suboptimal? As in, most of the people who might've been best positioned to do better modeling saw initial predictions/AI x-risk arguments and instead decided to jump right into object-level safety work -- but this might not have been ideal (potentially a coordination problem)?
I think that's right, though it's a different frame than the one I put on it.
Good modeling is a multiplier on object-level research because it convinces relevant external stakeholders to care about that research, and stakeholder-caring is the biggest bottleneck to that research having an impact.
So yes, even if timelines are in fact short, this calls for more (better) modelers.
Ah, got it. Idk if I'm being naive here, but for AI safety/gov specifically, is it actually true that good modeling effectively amplifies (rather than mostly clarifies the impact of) object-level research?
It seems like AI forecasts inherently can't achieve the rigor of typical policy-relevant models, e.g. climate or pandemic modeling. And this comment on my LW crosspost basically says that AI 2027 is already easily in the "top 1%" of emerging tech forecasts (the commenter's field), and that such models aren't actually valued for the conclusions themselves - https://www.lesswrong.com/posts/gdv93PboLFGHy8vPR/the-practical-value-of-flawed-models-a-response-to-titotal-s#comments
The example you bring up is the CAIS letter, which mostly seems to be evidence that elite endorsement is important, & doesn't seem to pertain as much to modeling. I certainly think good modeling would only increase stakeholder-caring, if anything, but would it do so as effectively as other methods (especially considering the relatively high time cost)?
Yeah, I mean amplification. Merely having the right idea/research means nothing if it doesn't change how relevant actors behave. Policy ideas need to become law to matter (in most cases). For that, your idea/problem needs to be perceived as important.
I think climate is the right analogy. The gears-level things you do to improve the climate were always clear (to varying degrees of specificity): throw money and talent at wind, solar, fusion, electrification. Two-plus decades of work on "why climate change is real/important" is the main reason we spend ~10x more on obvious climate stuff now vs. 2005. I think that work was largely modeling climate impacts, collecting expert testimony/letters, and doing attribution for scary weather events. It's true there were some useful breakthroughs on solar when there were way fewer resources, but batteries (and maybe fusion later) were/will be downstream of the funding boom.
Right now real AI safety spending is something like $500m/year, and virtually no one cares about the problem. I think marginal work towards 10xing those resources by getting people to care is more valuable than reallocating some small portion of them toward slightly better ideas. Publishing many good, diverse, stress-tested models worked well for climate on this score; it can work for us too.