Sam Altman is back at the helm of OpenAI, but now the world has woken up to the problem. Several clues suggest my article from earlier this week got it right: this is about the fear that OpenAI could be approaching AGI without being prepared to contain it. What have we learned?
From Reuters:
Ahead of OpenAI CEO Sam Altman’s four days in exile, several staff researchers wrote a letter to the board of directors warning of a powerful artificial intelligence discovery that they said could threaten humanity, two people familiar with the matter told Reuters.
The previously unreported letter and AI algorithm were key developments before the board's ouster of Altman, the poster child of generative AI, the two sources said.
The sources cited the letter as one factor among a longer list of grievances by the board leading to Altman's firing, among which were concerns over commercializing advances before understanding the consequences.
In an internal message to staffers sent by long-time executive Mira Murati, OpenAI acknowledged a project called Q*. Some at OpenAI believe Q* (pronounced Q-Star) could be a breakthrough in the startup's search for what's known as artificial general intelligence (AGI), one of the people told Reuters. Given vast computing resources, the new model was able to solve certain mathematical problems. Though it was only performing math at the level of grade-school students, acing such tests made researchers very optimistic about Q*'s future success.
Researchers consider math to be a frontier of generative AI development. Currently, generative AI is good at writing and language translation by statistically predicting the next word, and answers to the same question can vary widely. But conquering the ability to do math, where there is only one right answer, implies AI would have greater reasoning capabilities resembling human intelligence. Unlike a calculator that can solve a limited number of operations, AGI can generalize, learn and comprehend. In their letter to the board, researchers flagged AI's prowess and potential danger.
Researchers have also flagged work by an "AI scientist" team, the existence of which multiple sources confirmed. The group, formed by combining earlier "Code Gen" and "Math Gen" teams, was exploring how to optimize existing AI models to improve their reasoning and eventually perform scientific work, one of the people said.
Altman last week teased at a summit of world leaders in San Francisco that he believed major advances were in sight: "Four times now in the history of OpenAI, the most recent time was just in the last couple weeks, I've gotten to be in the room, when we sort of push the veil of ignorance back and the frontier of discovery forward, and getting to do that is the professional honor of a lifetime." A day later, the board fired Altman.
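To make the Reuters point about next-word prediction concrete, here is a minimal toy sketch in Python. The word probabilities are invented for illustration and have nothing to do with OpenAI's actual models; it only shows why sampled text answers can differ from run to run, while a grade-school math question has exactly one correct answer:

```python
import random

# Toy "next word" distribution for a prompt like "The capital of France is ..."
# The probabilities are made up purely for illustration.
next_word_probs = {"Paris": 0.55, "Lyon": 0.25, "Marseille": 0.20}

def sample_next_word(probs):
    """Pick one candidate word in proportion to its probability."""
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

# The same "question" can produce different answers across runs.
print([sample_next_word(next_word_probs) for _ in range(3)])

# A math problem, by contrast, is pass/fail: only one answer is correct.
print(12 * 7 == 84)  # True; anything other than 84 is simply wrong
```

That single right answer is why reliable math is treated as a sign of reasoning rather than pattern completion.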
At the same time, individuals are risking their careers to raise serious concerns about OpenAI, like Nathan Labenz, a former beta tester of GPT-4:
Here are some quotes from his thread on the topic (you can see a similar video here):
Almost everyone else is thinking too small
After testing GPT-4 before public release:
A paradigm shifting technology - truly amazing performance.
For 80%+ of people, OpenAI has created Superintelligence.
Yet somehow the folks I talked to at OpenAI seemed … unclear on what they had.
I asked if there was a safety review process I could join. There was: the "Red Team."
After joining the Red Team, a group focused on attacking a new AI to probe its limits and find where it becomes unsafe:
The Red Team project wasn't up to par. There were only ~30 participants – of those only half were engaged, and most had little-to-no prompt engineering skill. Meanwhile, the OpenAI team gave little direction, encouragement, coaching, best practices, or feedback. People repeatedly underestimated the model. I spent the next 2 months testing GPT-4 from every angle, almost entirely alone. I worked 80 hours / week. By the end of October, I might well have logged more hours with GPT-4 than any other individual in the world.
I determined that GPT-4 was approaching human expert performance. Critically, it was also *totally amoral*. It did its absolute best to satisfy the user's request – no matter how deranged or heinous your request! One time, when I role-played as an anti-AI radical who wanted to slow AI progress... GPT-4-early suggested the targeted assassination of leaders in the field of AI – by name, with reasons for each.
Today, most people have only used more “harmless” models, trained to refuse certain requests. This is good, but I wish more people had experienced "purely helpful" AI – it makes viscerally clear that alignment / safety / control do not happen by default.
The Red Team project that I participated in did not suggest that they were on track to achieve the level of control needed. Without safety advances, the next generation of models might very well be too dangerous to release.
This technology, leaps & bounds more powerful than any publicly known, was a major step on the path to OpenAI's stated & increasingly credible goal of building AGI, or "AI systems that are generally smarter than humans" – and they had not demonstrated any ability to control it.
I consulted with a few friends in AI safety research. They suggested I talk to … the OpenAI Board.
The Board, everyone agreed, included multiple serious people who were committed to safe development of AI and would definitely hear me out, look into the state of safety practice at the company, and take action as needed.
What happened next shocked me. The Board member I spoke to was largely in the dark about GPT-4. They had seen a demo and had heard that it was strong, but had not used it personally. I couldn’t believe it. I got access via a "Customer Preview" 2+ months ago, and you as a Board member haven't even tried it??
If you're on the Board of OpenAI when GPT-4 is first available, and you don't bother to try it… that's on you. But if Sam failed to make clear that GPT-4 demanded attention, you can imagine how the Board might start to see him as "not consistently candid".
Unfortunately, a fellow Red Team member I consulted told the OpenAI team about our conversation, and they soon invited me to … you guessed it – a Google Meet 😂 "We've heard you're talking to people outside of OpenAI, so we're offboarding you from the Red Team"
Thankfully, he thinks things are better now:
Overall, after a bit of a slow start, they have struck an impressive balance between effectively accelerating adoption & investing in long-term safety. I give them a ton of credit, and while I won't stop watching any time soon, they've earned a lot of trust from me. In July, they announced the "Superalignment team". 20% of compute and a concrete timeline to try to solve the alignment problem ain't nothing! Then came the Frontier Model Forum and the White House Commitments – including a commitment to independent audits of model behavior, something I had argued for in my Red Team report.
Now, Sam, by all accounts, is an incredible visionary leader, and I was super impressed by how many people shared stories of how he's helped them over the years. Overall, with everything on the line, I'd trust him more than most to make the right decisions about AI. But still, does it make sense for society to allow its most economically disruptive people to develop such transformative & potentially disruptive technology?
Certainly not without close supervision! As Sam said … we shouldn't trust any one person here.
This is Altman at OpenAI's DevDay, a couple of weeks ago:
"What we launched today is going to look very quaint relative to what we're busy creating for you now."
"Is this a tool we've built or a creature we have built?"
Finally, Elon Musk apparently liked my Twitter thread on the topic¹, so I assume he agrees, which is telling given his strong early involvement in OpenAI.
All of this is telling me that, yes:
This drama is about AI safety
The board was indeed concerned about Altman’s behavior
What else are we going to cover today? Mainly, questions about the risk of misaligned AGI:
How likely is a bad AI? How much should we focus on it?
Is it coming soon?
How can it escape?
Does the barrier between the digital and real worlds protect us?
Can crypto save us?
A hypothesis on the latest developments with the OpenAI board