OpenAI Unveils Its Latest Approach to Ensuring AI Safety

by Agne Cimerman

Published: April 06, 2023 at 2:00 pm Updated: June 14, 2024 at 10:11 am

by Victor Dey

Edited and fact-checked: April 06, 2023 at 2:00 pm

In Brief

OpenAI has released a blog post outlining its improved approach to safety, following recent safety and privacy concerns and investigations.

The company pledges to conduct rigorous testing, engage external experts for feedback, and work with governments to determine the best approach for AI regulations.


After facing concerns regarding safety and privacy and following recent investigations in some European countries, OpenAI has released a blog post outlining the company’s improved approach to safety.

OpenAI pledges to conduct rigorous testing, engage external experts for feedback before releasing any new system, and work with governments to determine the best approach for AI regulations.

Previously, the company spent over six months working on the safety and alignment of its latest model, GPT-4, before releasing it publicly. For that work, OpenAI hired a team of over 50 experts in AI safety, ethics, and policy, including researchers and engineers.

“Crucially, we believe that society must have time to update and adjust to increasingly capable AI, and that everyone who is affected by this technology should have a significant say in how AI develops further,” OpenAI wrote.

OpenAI’s Focus on Children’s Safety and Privacy

Italy banned ChatGPT, citing as one of its reasons OpenAI’s failure to verify the age of its users, even though the service is designed for individuals aged 13 and over. Protecting children has since become a critical focus of the company’s safety efforts: OpenAI is now exploring age verification options, especially since its AI tools are intended for individuals aged 18 or older, or 13 and older with parental approval.

The company strictly prohibits the generation of hateful, harassing, violent, or adult content, and GPT-4 is already 82% less likely than GPT-3.5 to respond to requests for disallowed content.

OpenAI has established a robust system to monitor for abuse and hopes to make GPT-4 available to more people over time. The company works with developers, such as the non-profit Khan Academy, on tailored safety mitigations, and is working on features that will allow stricter standards for model outputs.

Improving privacy is another safety aspect OpenAI is focusing on, especially after the recent data breaches. The company’s large language models are trained on an enormous amount of text that includes publicly available content, licensed content, and content generated by human reviewers. The company uses this data to make its AI models more helpful to people.

OpenAI is taking steps to protect the privacy of individuals by removing personal information from its training data, fine-tuning models to reject requests for personal information, and deleting personal information from its systems upon request. This reduces the chance of AI models generating responses that include personal information.

OpenAI’s approach to addressing AI safety concerns is to research effective mitigations and alignment techniques and test them against real-world abuse. It also pledges to be increasingly cautious in creating and deploying more capable models and to enhance safety precautions. Ensuring safety also requires effective governance of AI development and deployment, which in turn calls for both technical and institutional innovation.



About The Author

Agne is a journalist who covers the latest trends and developments in the metaverse, AI, and Web3 industries for the Metaverse Post. Her passion for storytelling has led her to conduct numerous interviews with experts in these fields, always seeking to uncover exciting and engaging stories. Agne holds a Bachelor’s degree in literature and has an extensive background in writing about a wide range of topics including travel, art, and culture. She has also volunteered as an editor for an animal rights organization, where she helped raise awareness about animal welfare issues.
