We don’t know how to contain or align a FOOMing AGI
Tomas and friends:
I usually avoid shooting off my mouth on social media, but I’m a BIG fan of Tomas and this is one of the first times I think he’s way off base.
Everyone needs to take a breath. The AI apocalypse isn’t nigh!
Who am I? I’ve watched this movie from the beginning, not to mention participated in it. I got my PhD in AI in 1979 specializing in NLP, worked at Stanford, then co-founded four Silicon Valley startups, two of which went public, in a 35-year career as a tech entrepreneur. I’ve invented several technologies, some involving AI, that you are likely using regularly if not every day. I’ve published three award-winning or best-selling books, two on AI. Currently I teach “Social and Economic Impact of AI” in Stanford’s Computer Science Dept. (FYI Tomas’ analysis of the effects of automation – which is what AI really is – is hands down the best I’ve ever seen, and I am assigning it as reading in my course.)
May I add an even more shameless self-promotional note? My latest book, “Generative Artificial Intelligence: What Everyone Needs to Know” will be published by Oxford University Press in Feb, and is available for pre-order on Amazon: https://www.amazon.com/Generative-Artificial-Intelligence-Everyone-KnowRG/dp/0197773540. (If it’s not appropriate to post this link here, please let me know and I’ll be happy to remove.)
The concern about FOOM is way overblown. It has a long and undistinguished history in AI, the media, and (understandably so) in entertainment – which unfortunately Tomas cites in this post.
The root of this is a mystical, techno-religious idea that we are, as Elon Musk erroneously put it, “summoning the beast”. Every time there is an advance in AI, this school of thought (superintelligence, singularity, transhumanism, etc.) raises its head and gets far more attention than it deserves. For a bit dated but great deep dive on this, check out the religious-studies scholar Robert Geraci’s book “Apocalyptic AI: Visions of Heaven in Robotics, Artificial Intelligence, and Virtual Reality”.
AI is automation, pure and simple. It’s a tool that people can and will use to pursue their own goals, “good” or “bad”. It’s not going to suddenly wake up, realize it’s being exploited, inexplicably grow its own goals, take over the world, and possibly wipe out humanity. We don’t need to worry about it drinking all our fine wine and marrying our children. These anthropomorphic fears are fantasy. They rest on a misunderstanding of “intelligence” (that it’s linear and unbounded), of “self-improvement” (that it can run away, as opposed to leveling off asymptotically), and of ourselves (that we’re dumb enough to build unsafe systems and hook them up to the means to cause a lot of damage – which, arguably, describes current self-driving cars). As someone who has built numerous tech products, I can assure you that it would take a Manhattan Project to build an AI system that could wipe out humanity, and I doubt it would succeed. Even so, we would have plenty of warning and numerous ways to mitigate the risks.
This is not to say that we can’t build dangerous tools, and I support sane efforts to monitor and regulate how and when AI is used, but the rest is pure fantasy. “They” are not coming for “us”, because there is no “they”. If AI does a lot of damage, that's on us, not "them".
There’s a ton to say about this, but just to pick one detail from the post: the idea that an AI system will somehow override its assigned goals is illogical. It would have to be designed to do this (not impossible… but if so, that’s the assigned goal).
There are much greater real threats to worry about. For instance, that someone will use gene-splicing tech to make a highly lethal virus that runs rampant before we can stop it. Nuclear catastrophe. Climate change. All these things are verifiable risks, not a series of hypotheticals and hand-waving piled on top of each other. Tomas could write just as credible a post on aliens landing.
What’s new is that with Generative AI in general, and Large Language Models in particular, we’ve discovered something really important – that sufficiently detailed syntactic analysis can approximate semantics. LLMs are exquisitely sophisticated linguistic engines, and will have many, many valuable applications – hopefully mostly positive – that will improve human productivity, creativity, and science. It’s not “AGI” in the sense used in this post, and there’s a lot of reasonable analysis that it’s not on the path to this sort of superintelligence (see Gary Marcus here on Substack, for instance).
The recent upheaval at OpenAI isn’t some sort of struggle between evil corporations and righteous superheroes. It’s a predictable (and predicted!) consequence of poorly architected corporate governance and inexperienced management and directors. I’ve had plenty of run-ins with Microsoft, but they aren’t going to unleash dangerous and liability-inducing products onto a hapless, innocent world. They are far better stewards of this technology than many nations. I expect this awkward kerfuffle to blow over quickly, especially because the key players aren't going anywhere, they're just changing cubicles.
Focusing on AI as an existential threat risks drowning out the things we really need to pay attention to, like accelerating disinformation, algorithmic bias, so-called prompt hacking, etc. Unfortunately, it’s a lot easier to get attention screaming about the end of the world than calmly explaining that, like every new technology, there are risks and benefits.
It’s great that we’re talking about all this, but for God’s sake please calm down! 😉
Anthropomorphising AI is a big fail. Once you use that lens you’re skewed by human values that are often entirely misguided in the perception of machine intelligence. Much of the alignment problem is actually in the category of humans acting badly. No decisions should be made on an emotional underpinning. If there is one thing we should know as students of history, it’s that coarse, visceral human emotions, badly channeled, are a horrible guideline in the aggregate.
I realise that you’re using the acronym FOOM to describe recursive self-improvement. After having researched it, I don’t think that’s a widely used acronym in the AI realm. Although I did find 28 references to it, none of them was in the machine learning/AI recursive self-improvement category. So if you’re going to use an acronym, it’s very helpful to actually define it.
I’ve been reading Eliezer Yudkowsky for years and years. He has a brilliant intellect and he is able to see problems that may be real. And yet, there are a few things that have molded his personality and amplified his deeply fearful disposition. One is the fact that when he was very young his brother died; this deeply traumatised him and has compelled him to have a huge and abiding fear of death ever since. That fear is really very disproportionate. Now when I hear or read him I find his approach to be quite shrill. I suspect he’s the type of guy who would fear that the I Love Lucy broadcasts into interstellar space are going to bring alien invasions to kill us. He has a gift for twisting almost any AI development into a doomer outcome. There’s no denying the prospect that it could happen. Are his fears realistic? I think that depends on how far up in the world of fear you want to go. Many people are afraid of their own shadows. Alignment is certainly a problem, but there are tendencies to amplify human fears out of all proportion.
AGI will certainly evolve on a spectrum. It may soon slip into every crack and crevice of our infrastructure so that we can’t dislodge it. Yet there is no reason to believe it will be malevolent towards humans, any more than we have it out for squirrels. Alignment will be a concerted effort nonetheless. If you’re truly worried about AGI supplanting humanity (people may be inclined to draw a parallel with how we supplanted the Neanderthals), then I think we should look at it through the lens of evolution. Humans have cultivated a deep fear because they know how truly unjust they can be to everyone outside their own tribal community, and occasionally horrific inside their tribes. The thought of AGI emulating humans who are tremendous shits is cause for not having it emulate the traits of “human nature”. Just ask the Native Americans, or any other species. Humans fear out of evolutionary pressures, yet AI/AGI has no evolutionary origins that would give it an interest in contests of tooth and claw, the kind that shaped primitive human instincts. It turns out that humans are almost always the real monsters. Human alignment should be a concern of equal importance.
I suspect the golden age will be proportional to how integrated humanity is with AI. The limit case is to cross into the transhuman thresholds. After that, who knows how events will be crafted, and in the scope of what composite set of agendas. This is all evolutionary.
Ultimately evolution will be unconstrained at scale. Realize that humanity is, in the big picture, just a boot-up species for superintelligence. No matter how advanced individual transhumans may become, humans who are not augmented will become like the dinosaurs or the Intel 386 chips of our era. Yet we will have accomplished a great purpose.
Excellent as always, Tomas. One comment that I believe many are missing... I'm not confident that a singular AGI/ASI would be the actual threat vector here. To GingerHipster's comment, it seems much more likely that a mass of less general/more specific GPTs/agents with various agendas and objectives could cause problems much, much sooner.
That is, it's not about one brilliant AI but rather a global network of connected AIs processing at speeds we cannot comprehend. Have you read "Stealing Worlds" by Karl Schroeder? If not, worth a peruse for a future vision of technology encompassing self-sovereign identity, blockchains, Network States, AR/VR and AI that is compelling.
Great article as usual Tomas.
In this tweet https://twitter.com/balajis/status/1726171850777170015 Balaji makes these arguments, (among others that I don't mention because I think they are weaker) :
- Just the fact that there is China means at least two AGIs, not one (so, the implicit argument is that any voluntary slowdown in the development of an American/Western AGI will give a Chinese AGI a greater chance of taking the lead).
- There is enough friction in the real, physical world, to prevent the first superintelligence from instantly taking over the world before there are others to counterbalance it
- (from this tweet https://twitter.com/balajis/status/1726933398374199425) The best way forward to try to solve the AI alignment problem is to have multiple open-source AGIs competing against each other, while also cooperating
- By the time an AGI comes we’ll probably have private-key control for everything, and those keys are cryptographically difficult for even an AI to break (implicit argument: we will have encryption algorithms resistant to quantum computers, which will be difficult even for an AGI to break)
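As an aside, the scale behind that last point is easy to sanity-check with back-of-envelope arithmetic. This sketch assumes an (illustrative, very generous) adversary making 10^18 key guesses per second; the 128-bit case stands in for a 256-bit key whose effective security is roughly halved by Grover’s algorithm on a quantum computer:

```python
# Expected time to exhaust a keyspace by brute force.
# guesses_per_second is an assumed, planet-scale adversary; real
# attackers are far slower, so these are lower bounds on difficulty.

GUESSES_PER_SECOND = 10**18
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

def years_to_search(bits: int) -> float:
    """Years needed to try every key in a `bits`-bit keyspace."""
    return 2**bits / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(f"128-bit keyspace: {years_to_search(128):.2e} years")
print(f"256-bit keyspace: {years_to_search(256):.2e} years")
```

Even the Grover-degraded 128-bit case comes out to trillions of years, which is why the argument hinges on algorithmic breaks (or key mismanagement), not raw compute.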
What do you think of these arguments?
Total Kool-Aid-drinking waffle. Yes, an actual, real AGI would be all of those things. But that is not even remotely what OpenAI or any of these other companies have. All they have is more advanced versions of something we've had for years: predictive-text chatbots. Oh, and a WHOLE lot of hype and marketing. They are just word calculators. There is nothing behind the curtain. Hell, even potato-head Elon Musk managed to get one up and running with just 4 months of work (not by him, of course!); that should show you how this is nothing more than a scam. Yes, real, actual AGI will be very scary and will need to be managed very carefully, but this is not remotely it, and they are nowhere even close, no matter what they tell us to boost their stock price!
Thank you for the dive into AGI and what could go wrong. It's refreshing after days of people cheerleading Altman and railing on the OpenAI board — it's almost as if people don't care about humanity any more.
The most challenging part about OpenAI and AGI is separating hype and reality. After months of Altman parading around his blue backpack with a "kill switch" in case of imminent sentience, it became clear this was a distraction tactic to prevent regulation that would slow down growth. I've also heard reports that the current LLM growth curve is leveling out, which is the opposite of FOOMing. At the same time, my old boss, Blaise Agüera y Arcas (one of the heads of AI at Google), argues that AGI is already here (https://www.noemamag.com/artificial-general-intelligence-is-already-here/).
My greatest concern is that the success criteria for AI are inherited from our collective history of praising ruthless domination and winner-takes-all behavior.
> a Google AI engineer (the type of person you’d think are mindful of this type of problem), working on a more basic LLM
I feel that (even with the link) this part is misleading. He was a crazy guy who was hired precisely for being different, and who had grifted his way into many jobs like it. I don't think he wrote any code that went into the LLM or understood it in any technical way; he was merely an evaluator. You might want to check more technical contemporary news sources for a better analysis of his credentials.
This is the best, most complete explanation of what AI is and the horrific consequences if/when somebody does it wrong and it goes full AGI. I had no understanding of what this was. Thank you ( but maybe I should have stayed in a blissful state of ignorance).
Thank you for the article Tomas.
There are a few themes I would like to read more about.
1. Experts anthropomorphizing AI in the other direction by suggesting that AI needs to be like human intelligence to be successful. Humans truly have nothing to compare their intelligence to, which makes it other-worldly ... and now there is AI. Why exactly does AI need to be recognizable, to look like human intelligence?
2. How do humans handle chaos? We build rules. Rules actually look more like guidelines or guardrails. We do not stop our three- and four-year-old children from out-and-out lying because we can't, understanding that lying is a tool to be used delicately, with the utmost consideration. It takes practice to do well. It will likely leave some wreckage over a lifetime. AI will mirror some of these human mastery challenges precisely because it must master a human language early on the road to autonomy. There will be other signposts of sophistication.
3. There is something (haha, probably much more than what is known) science has yet to unravel about how nature works. Whether living or non-living, organic or inorganic, change seems to exhibit a progression that is not reversible. The entities that live on the Earth are one with the Earth, and so this progression naturally extends to entities which aren't generally considered living, in the human sense of the word. Have we somehow engineered an entity that exists outside of that reality? Isn't AI going to assume its natural place in that progression? If so, then maybe proper guardrails will be enough, because that is really all we can do. We are certainly incapable of halting progress.
Thank you for your consideration Tomas.
This is a good overview of why we need to take X-risk from artificial intelligence seriously, and I agree with Tomas that many dismiss it without really engaging with the best arguments (from people like Bostrom and Yudkowsky et al). I wrote about this on Sképsis earlier this year: https://philipskogsberg.substack.com/p/the-genie-in-the-bottle-an-introduction
Having said that, I can't really convince myself to worry too much about AI wiping out all life/human life. Perhaps this is a failure of imagination on my part, but I find the following two arguments against the X-risk hypothesis the most persuasive:
1. We won’t create true AGI: Advanced, power-seeking AGIs with strong strategic awareness and reasoning may pose existential risks that are more or less impossible to fully pre-empt and counteract. But it's either impossible to create such AGIs, or it's such a distant possibility that we can’t do anything about it anyway.
2. If we create AGI it will not be power-seeking: Even though power-seeking and super-optimizing advanced AIs could pose an existential threat in theory, we shouldn't assume that the kind of AGI we will create will be power-seeking in the way that humans are power-seeking.
From this article: https://open.substack.com/pub/philipskogsberg/p/why-ai-probably-wont-kill-us-all
As Tomas pointed out in one of the comments, deductive arguments lead us to the conclusion that a true AGI with power-seeking behaviors will almost certainly lead to the extinction of humanity. But inductive arguments moderate that conclusion a lot, and make it much less likely. Of course, even a very small chance of a very bad event should be taken seriously; the question is how far we should go to prevent it. My personal opinion so far is that, if anything, we should continue developing AI and AI frameworks and regulate it lightly, and only when clear problems have been demonstrated. Stopping the development of an AGI or regulating the industry to death is an approach that will cost us much more, and will not necessarily prevent X-risk anyway.
A few observations. First, I think Mr. Pueyo’s article may be the best overview of the AI story I have read thus far. I appreciate that the article gives a novice like myself plenty of history of the development of the technology, as well as the context for what is playing out, in real time, with the different theories and opinions regarding the dangers AI may represent.
Second, I really appreciated the author’s use of example scenarios to illustrate how these dangers may manifest or what the motivations may be that would bring them to pass.
The short stock trading scenario seems both humorous and scary at the same time, but very plausible.
Last, I find I am left wondering whether the author’s call for a real investigation into what occurred in the OpenAI shuffle will ever actually come to pass. Obviously the article was written prior to the decision to reinstate Sam Altman and oust the four board members who ousted him. So any future investigation will be Sam Altman investigating himself?
Bloomberg observers are speculating that the new “board” will now be a collection of high-level tech industry business people, perhaps even Nadella himself, whose motivations will be far more aligned with the commercialization of the technology for profit than with fortifying the guardrails to protect humanity from itself.
I know it seems overdramatic to say, but you must wonder if this will wind up being the “Terminator moment”: that point we all look back to, as the world crumbles around us and the machines take over, and say, “Yup, that was the turning point. If only we had heeded the warnings.”
I must admit that I found it more than a little disturbing that Elon Musk, who, I understand, was a member of the founding governance board and an early investor, chose to leave. When Elon expresses his “concern” over an event like this, I think that we all probably should do so as well.
I did mention that this last bit was likely going to be overly dramatic, didn’t I?
Tomas, I do not think I can recall a livelier discussion than this one, and it speaks volumes to your ability to delve into a developing story of this complexity and then make a right turn and explain for us readers something completely different in another sphere. I find it remarkable.
My only comment on the narrative above regards the suggestion that we should slow AI down. So many others have made the same argument, but I am at a loss as to how that would work. It seems to me that there is no way to slow down the work being done on AI and LLMs. Will China or Russia, Iran or North Korea "slow down"? We in the West can only hope that we are keeping pace with the rest of the world. If by some miracle we are ahead, or perhaps we remain ahead, great, but slowing down seems to be a fool's errand.
Britain, then America later, had the largest naval fleet in the world... until they didn't. Now, it's China.
There is no slowing down when, for good or bad, you are in the race.
Don't know any of the principals in this analysis, but it does generate some thoughts....
The golem could be helpful for its rabbi maker, or become destructive if/when suitable controls were not deployed, such as removing the aleph from its forehead before the sabbath.
A mouse apprentice to a wizard conjured up a self-actuating broom to carry out cleaning tasks he disdained. Things got a bit out of hand until the wizard returned....
Lots of intelligent people (not the same as wise) have undertaken the creation of a great force to benefit mankind, unless it becomes mal-aligned.
Many fervent people can't wait to leave this troubled sphere for the greater glories of whatever heaven they trust is there for them, and may even welcome its hastening by a god-like force.
Great essay and will be sharing this to non-tech friends as an intro to AI safety
> A bunch of OpenAI employees left the company several months ago to build Anthropic because they thought OpenAI was not focused enough on alignment.
Should it be a "several years ago"?
Don't you think that even without achieving AGI, we will have such overwhelming advancements in AI-enhanced weaponry, lethal autonomous weapons, drone swarms, etc., that the scales could be tipped to where even nuclear powers (and their sacred assets) could be challenged and have havoc wreaked upon them by weaker states or ideologically driven entities (terrorists) on the battlefields of the future? Could this itself be the biggest threat, if the nuclear powers are then compelled to unleash their might to subdue improvisers banking on scale and speed over raw power?