OpenAI’s Superalignment Team Unveils Innovative Method for AI System Oversight

Date: 2023-12-15

The Superalignment team at artificial intelligence (AI) research organization OpenAI has published a research paper proposing that smaller AI models be used to supervise more advanced ones. The early results are promising and could eventually make it simpler for people to supervise highly advanced AI systems.

In conventional machine learning, humans oversee AI systems less intelligent than themselves. However, with superintelligent AI, humans will need to supervise systems more intelligent than themselves.

In the pursuit of guiding and controlling superintelligent AI systems, researchers proposed a method where a less powerful AI model could supervise a more powerful one.

Ordinarily, a strong model would not be expected to outperform the weak supervisor it learns from, but strong pretrained models already possess excellent capabilities. The researchers tested the approach by supervising GPT-4 with a GPT-2-level model. The resulting model typically performed between GPT-3 and GPT-3.5, indicating that strong capabilities can be achieved even with weaker supervision.

Overall, the results suggest that while basic human supervision might not scale well to superintelligent models, there are ways to significantly enhance how well these models learn from less capable supervisors.

The researchers believe their approach captures essential difficulties of aligning future superhuman models, allowing them to begin making progress on the problem now.

Alongside announcing research findings, OpenAI encouraged students, academics and other researchers to contribute to the broad field of superhuman AI alignment. The company launched a $10 million grant program focused on this problem.

OpenAI Advances Evolving Landscape of AI

According to OpenAI, superintelligent AI models exceeding human capabilities might be developed within the next decade. However, managing and directing these sophisticated AI systems poses a considerable challenge for people.

Existing alignment methods, such as reinforcement learning from human feedback (RLHF), depend on human guidance. However, future AI systems will be capable of extremely complex and creative behaviors that will make it hard for humans to reliably supervise them. Thus, the company is actively exploring innovative solutions to address the evolving landscape of AI development.

Recently, rumors circulated that OpenAI had released its latest model, GPT-4.5. An alleged screenshot suggested advanced capabilities spanning language, audio, vision, video, 3D, complex reasoning, and cross-modal understanding. However, OpenAI CEO and co-founder Sam Altman denied the rumors.

