168 - How to Solve AI Alignment with Paul Christiano

Paul previously ran the language model alignment team at OpenAI, the creators of ChatGPT.
Today, we explore the solution-landscape of the AI Alignment problem, with Paul guiding us on that journey.
In today’s episode, Paul answers many questions, but the overarching ones are:
1) How BIG is the AI Alignment problem?
2) How HARD is the AI Alignment problem?
3) How SOLVABLE is the AI Alignment problem?
Does humanity have a chance? Tune in to hear Paul’s thoughts.
TIMESTAMPS
0:00 Intro
9:20 Percentage Likelihood of Death by AI
11:24 Timing
19:15 Chimps to Human Jump
21:55 Thoughts on ChatGPT
27:51 LLMs & AGI
32:49 Time to React?
38:29 AI Takeover
41:51 AI Agency
49:35 Loopholes
51:14 Training AIs to Be Honest
58:00 Psychology
59:36 How Solvable Is the AI Alignment Problem?
1:03:48 The Technical Solutions (Scalable Oversight)
1:16:14 Training AIs to be Bad?!
1:18:22 More Solutions
1:21:36 Stabby AIs
1:26:03 Public vs. Private (Lab) AIs
1:28:31 Inside Neural Nets
1:32:11 4th Solution
1:35:00 Manpower & Funding
1:38:15 Pause AI?
1:43:29 Resources & Education on AI Safety
1:46:13 Talent
1:49:00 Paul’s Day Job
1:50:15 Nobel Prize
1:52:35 Treating AIs with Respect
1:53:41 Utopia Scenario
1:55:50 Closing & Disclaimers
RESOURCES
Alignment Research Center
https://www.alignment.org/
Paul Christiano’s Website
https://paulfchristiano.com/ai/