OpenAI Assembles a Team of 50+ Experts to Enhance GPT-4’s Safety

News Report Technology

In Brief

OpenAI has hired a team of over 50 experts to ensure that its newest language model, GPT-4, is safe for use.

The team includes researchers and engineers specializing in AI safety, ethics, and policy.

The aim is to prevent GPT-4 from generating harmful or biased content and to ensure that it aligns with human values.


The Trust Project is a worldwide group of news organizations working to establish transparency standards.

OpenAI Assembles a Team of 50+ Experts to Enhance GPT-4’s Safety

OpenAI has hired over 50 experts from various domains to make GPT-4 safer. The experts have been working with adversarial testing of the model to identify potential risks and vulnerabilities. They are experts from various areas: long-term AI alignment risks, cybersecurity, biorisk, and international security. Their findings have helped OpenAI evaluate model behavior in high-risk areas that require niche expertise. 

While the newest language model poses similar risks as smaller language models, the additional capabilities of GPT-4 lead to new threats. Thus, the engagement of experts has been crucial in ensuring the technology’s safety.

OpenAI has implemented an additional set of safety-relevant reinforcement learning from human preferences (RLHF) training prompts and rule-based reward models (RBRMs) to improve the safety of the GPT-4 model. The RBRMs are zero-shot GPT-4 classifiers that function as an extra reward signal for the GPT-4 policy model during RLHF fine-tuning. Their purpose is to incentivize appropriate behavior, such as declining to generate harmful content or not rejecting harmless requests.

To ensure the safety of the GPT-4 models, OpenAI began recruiting external experts in August 2022 to conduct “red teaming” exercises, including stress testing, boundary testing, and adversarial testing. They had access to early versions of the GPT-4 model and identified initial risks that motivated further safety research.

The experts’ feedback led to technical mitigations and policy enforcement measures to reduce risks. However, many threats remain, and further evaluation is needed.

Talking about employees at OpenAI, ChatGPT was initially developed with assistance from individuals in some of the world’s poorest regions through OpenAI’s partnership with a company called Sama, which employs millions of workers from impoverished areas. Some experts in AI ethics have criticized OpenAI’s decision to outsource the training of its ChatGPT model to Sama, accusing the company of exploiting low-cost labor.

Read more:

Disclaimer

Any data, text, or other content on this page is provided as general market information and not as investment advice. Past performance is not necessarily an indicator of future results.

Agne Cimermanaite

Agne is a journalist and writer with a background in literature, culture, and arts. She entered the Web3 space in 2021 and began writing about cryptocurrency and NFTs. Agne is passionate about technology and storytelling and is always on the lookout for exciting stories.

Follow Author

More Articles