
A new OpenAI team is tackling the challenge of superintelligence alignment: ensuring that future AI systems far smarter than humans follow human intent.

The team, co-led by Ilya Sutskever and Jan Leike, is dedicated to finding the scientific and technical breakthroughs needed to safely steer and control AI systems that could bring unprecedented progress, but that could also prove dangerous, with potentially severe unintended consequences for humanity.

The ambitious goal of this new "Superalignment" team is to build "the first automated alignment researcher" with roughly human-level capabilities. The team expects to "iteratively align superintelligence" using "vast amounts of compute" and to solve the core technical challenges of superintelligence alignment within four years. OpenAI is dedicating 20% of the compute it has secured to date to this goal.

"Superintelligence alignment is fundamentally a machine learning problem, and we think great machine learning experts—even if they’re not already working on alignment—will be critical to solving it," OpenAI writes.

The announcement comes amid growing criticism that dystopian scenarios of human extinction by a super-AI serve to distract from the current, concrete dangers of AI.


"An incredibly ambitious goal"

To achieve this "incredibly ambitious goal," the team plans to develop a scalable training method, validate the resulting model, and stress-test its entire alignment pipeline.

They plan to focus on scalable oversight and generalization, which can provide a training signal for tasks that are difficult for humans to evaluate. To validate system alignment, they plan to automate the search for problematic behavior and problematic internal processes, and to test the entire pipeline adversarially, for example by deliberately training misaligned models and checking whether their techniques detect them.
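To make the scalable-oversight idea a bit more concrete, here is a minimal toy sketch in Python. It assumes a hypothetical best-of-n setup in which a weaker "overseer" model scores samples from a stronger model and the highest-scored answer becomes a training target; all function names and models are illustrative stand-ins, not OpenAI's actual pipeline.

```python
# Toy sketch of scalable oversight: a weaker "overseer" provides the
# training signal for a stronger model on tasks humans cannot grade
# directly. Everything here is a hypothetical stand-in.

import random

def strong_model(prompt: str) -> str:
    """Stand-in for a capable model: returns one candidate answer."""
    return f"{prompt} -> answer_{random.randrange(4)}"

def weak_overseer_score(prompt: str, answer: str) -> float:
    """Stand-in for a weaker evaluator that grades an answer.
    In practice this would itself be an AI-assisted evaluation."""
    return random.random()  # placeholder for a learned critique/score

def oversight_training_signal(prompt: str, n_samples: int = 8):
    """Best-of-n selection: sample the strong model n times, keep the
    answer the overseer rates highest, and use it as a distillation
    target for further training."""
    samples = [strong_model(prompt) for _ in range(n_samples)]
    scored = [(weak_overseer_score(prompt, s), s) for s in samples]
    best_score, best_answer = max(scored)
    return best_answer, best_score  # (training target, signal strength)

if __name__ == "__main__":
    target, score = oversight_training_signal("Summarize this proof")
    print(f"Training target: {target!r} (overseer score {score:.2f})")
```

The design point of such schemes is that evaluating an answer is often easier than producing one, so a weaker judge can, in principle, supervise a stronger generator.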

The team acknowledges that its research priorities are likely to evolve as it learns more about the problem, and that entirely new research areas may be added along the way. OpenAI promises to "share the fruits of this effort broadly" and is looking for researchers and engineers to join the effort.

The new team's work will complement ongoing projects at OpenAI aimed at improving the safety of current models and understanding and mitigating other AI-related risks, such as misuse, economic disruption, disinformation, bias and discrimination, and addiction and dependency.

Summary
  • OpenAI is launching a new team to work on the challenge of superintelligence alignment, led by Ilya Sutskever and Jan Leike.
  • Their goal is to create an automated alignment researcher with human-level capabilities and solve the core technical challenges of superintelligence alignment in four years.
  • OpenAI is dedicating 20% of its secured computing power to this effort over the next four years and is seeking outstanding researchers and engineers to join the team.