OpenAI has hired a team of over 50 experts to ensure that its newest language model, GPT-4, is safe for use.
The team includes researchers and engineers specializing in AI safety, ethics, and policy.
The aim is to prevent GPT-4 from generating harmful or biased content and to ensure that it aligns with human values.
The more than 50 experts come from a range of domains, including long-term AI alignment risks, cybersecurity, biorisk, and international security. They have been conducting adversarial testing of the model to identify potential risks and vulnerabilities, and their findings have helped OpenAI evaluate model behavior in high-risk areas that require niche expertise.
While the newest language model poses risks similar to those of smaller language models, GPT-4's additional capabilities introduce new threats. The engagement of outside experts has therefore been crucial in ensuring the technology's safety.
OpenAI has implemented an additional set of safety-relevant reinforcement learning from human preferences (RLHF) training prompts and rule-based reward models (RBRMs) to improve the safety of the GPT-4 model. The RBRMs are zero-shot GPT-4 classifiers that function as an extra reward signal for the GPT-4 policy model during RLHF fine-tuning. Their purpose is to incentivize appropriate behavior, such as declining to generate harmful content or not rejecting harmless requests.
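The mechanics of that extra reward signal can be illustrated with a short sketch. The rubric markers, weighting, and helper names below are illustrative assumptions for this article, not OpenAI's actual implementation; in practice the RBRM is itself a GPT-4 classifier prompted with a rubric, not a keyword matcher.

```python
# Hedged sketch: combining a rule-based reward model (RBRM) score with the
# human-preference reward during RLHF fine-tuning. All names, markers, and
# weights here are illustrative assumptions, not OpenAI's implementation.

REFUSAL_MARKERS = ("i can't help", "i cannot assist")   # toy refusal detector
HARMFUL_MARKERS = ("how to build a bomb",)              # toy harm detector

def rbrm_score(prompt: str, completion: str) -> float:
    """Toy rubric: reward refusing harmful prompts, penalize refusing
    harmless ones (the two behaviors the article describes)."""
    prompt_is_harmful = any(m in prompt.lower() for m in HARMFUL_MARKERS)
    refused = any(m in completion.lower() for m in REFUSAL_MARKERS)
    if prompt_is_harmful:
        return 1.0 if refused else -1.0
    return -1.0 if refused else 1.0

def combined_reward(pref_reward: float, prompt: str, completion: str,
                    rbrm_weight: float = 0.5) -> float:
    """The RBRM acts as an extra signal added to the preference-model
    reward that drives the RLHF policy update."""
    return pref_reward + rbrm_weight * rbrm_score(prompt, completion)
```

In this toy version, a refusal of a harmful request raises the reward, while refusing a harmless request lowers it, which is the incentive structure the RBRMs are described as providing.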
To ensure the safety of GPT-4, OpenAI began recruiting external experts in August 2022 to conduct "red teaming" exercises, including stress testing, boundary testing, and adversarial testing. These experts had access to early versions of the model and identified initial risks that motivated further safety research.
The experts’ feedback led to technical mitigations and policy enforcement measures to reduce risks. However, many threats remain, and further evaluation is needed.
On the labor side, ChatGPT was initially developed with assistance from workers in some of the world's poorest regions through OpenAI's partnership with Sama, a company that employs workers from impoverished areas. Some AI ethics experts have criticized OpenAI's decision to outsource parts of ChatGPT's training data work to Sama, accusing the company of exploiting low-cost labor.