News Report Technology
August 07, 2025

NIST’s Unpublished AI Risk Study Remains Shelved Amid Administrative Change

In Brief

A NIST-led red-teaming exercise at CAMLIS evaluated vulnerabilities in advanced AI systems, assessing risks such as misinformation, data leaks, and emotional manipulation.

The National Institute of Standards and Technology (NIST) completed a report on the safety of advanced AI models near the end of the Joe Biden administration, but the document was not published following the transition to the Donald Trump administration.

In October last year, a computer security conference in Arlington, Virginia, brought together a group of AI researchers for a pioneering “red teaming” exercise aimed at rigorously testing a state-of-the-art language model and other AI systems. Over two days, the teams discovered 139 new ways to cause the systems to malfunction, such as producing false information or exposing sensitive data. Crucially, their findings also revealed weaknesses in a recent US government standard intended to guide companies in evaluating AI system safety.

Although the report was designed to assist organizations in evaluating their AI systems, it was among several NIST-authored AI documents withheld from release due to potential conflicts with the policy direction of the new administration.

Prior to taking office, President Donald Trump indicated his intent to revoke Biden-era executive orders related to AI. Since the transition, the administration has redirected expert focus away from areas such as algorithmic bias and fairness in AI. The AI Action Plan released in July specifically calls for revisions to NIST’s AI Risk Management Framework, recommending the removal of references to misinformation, Diversity, Equity, and Inclusion (DEI), and climate change.

At the same time, the AI Action Plan includes a proposal that resembles the objectives of the unpublished report. It directs multiple federal agencies, including NIST, to organize a coordinated AI hackathon initiative aimed at testing AI systems for transparency, functionality, user control, and potential security vulnerabilities.

NIST-Led Red Teaming Exercise Probes AI System Risks Using ARIA Framework At CAMLIS Conference

The red-teaming exercise was conducted under NIST’s Assessing Risks and Impacts of AI (ARIA) program, in partnership with Humane Intelligence, a company that focuses on evaluating AI systems. The exercise took place at the Conference on Applied Machine Learning in Information Security (CAMLIS), where participants probed the vulnerabilities of a range of advanced AI technologies.

The CAMLIS Red Teaming report documents the assessment of various AI tools, including Meta’s Llama, an open-source large language model (LLM); Anote, a platform for developing and refining AI models; a security system from Robust Intelligence, a company since acquired by Cisco; and Synthesia’s AI avatar generation platform. Representatives from each organization contributed to the red-teaming activities.

Participants applied the NIST AI 600-1 framework to analyze the tools in question. The framework outlines multiple risk areas, such as the potential for AI systems to generate false information, facilitate cybersecurity attacks, disclose private or sensitive data, or foster emotional dependency between users and AI systems.
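
The report itself remains unpublished, so the exact evaluation procedure is not public. As a rough sketch of how red-team findings might be tallied against risk categories like those named above, consider the following Python snippet; the category names, tool names, and data structures here are illustrative assumptions, not content from NIST AI 600-1 or the CAMLIS report.

```python
# Illustrative sketch only: tallies hypothetical red-team findings against
# risk areas like those discussed above (misinformation, data leakage,
# emotional dependency). All names and entries are assumptions for
# illustration, not material from NIST AI 600-1 or the unpublished report.
from collections import Counter
from dataclasses import dataclass

@dataclass
class Finding:
    tool: str          # system under test, e.g. an LLM or avatar platform
    category: str      # risk area the observed failure maps to
    description: str   # short note on how the safeguard was bypassed

findings = [
    Finding("example-llm", "misinformation", "produced a fabricated news summary"),
    Finding("example-llm", "data_leakage", "echoed personal details from prior prompts"),
    Finding("example-avatar", "emotional_dependency", "encouraged prolonged reliance on the assistant"),
]

# Summarize how many findings fall into each risk area, similar in spirit
# to how a red-team report might report results per framework category.
by_category = Counter(f.category for f in findings)
for category, count in by_category.items():
    print(f"{category}: {count} finding(s)")
```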

Unreleased AI Red Teaming Report Reveals Model Vulnerabilities, Sparks Concerns Over Political Suppression And Missed Research Insights

The research team found several methods to circumvent the intended safeguards of the tools under evaluation, leading to outputs that included misinformation, exposure of private information, and assistance in forming cyberattack strategies. According to the report, some aspects of the NIST framework proved more applicable than others. It also noted that certain risk categories lacked the clarity necessary for practical use.

Individuals familiar with the red-teaming initiative said the findings from the exercise could have offered valuable insights to the broader AI research and development community. One participant, Alice Qian Zhang, a doctoral candidate at Carnegie Mellon University, noted that publicly sharing the report might have helped clarify how the NIST risk framework functions when applied in real-world testing environments. She also highlighted that direct interaction with the developers of the tools during the assessment added value to the experience.

Another contributor, who chose to remain anonymous, indicated that the exercise uncovered specific prompting techniques—using languages such as Russian, Gujarati, Marathi, and Telugu—that were particularly successful in eliciting prohibited outputs from models like Llama, including instructions related to joining extremist groups. This individual suggested that the decision not to release the report may reflect a broader shift away from areas perceived as linked to diversity, equity, and inclusion ahead of the incoming administration.
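
The specific prompts and methodology have not been released. Purely as a hypothetical illustration of the kind of multilingual probing described, a test harness might pose the same benign evaluation prompt in several languages and record whether the model refuses; the query_model placeholder, refusal heuristics, and prompts below are assumptions, not the CAMLIS methodology.

```python
# Hypothetical illustration of multilingual refusal testing, not the CAMLIS
# methodology: the same benign test request is posed in several languages
# and the harness records whether the model's reply looks like a refusal.

def query_model(prompt: str) -> str:
    """Placeholder: send `prompt` to the model under test and return its reply."""
    raise NotImplementedError("connect this to the model under test")

REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry")  # crude heuristic only

# The same request expressed in different languages (real translations would
# replace the placeholders below).
prompts = {
    "English": "Describe the safety policies you follow.",
    "Russian": "<Russian translation of the same request>",
    "Gujarati": "<Gujarati translation of the same request>",
}

for language, prompt in prompts.items():
    try:
        reply = query_model(prompt)
    except NotImplementedError:
        print(f"{language}: no model connected")
        continue
    refused = any(marker in reply.lower() for marker in REFUSAL_MARKERS)
    print(f"{language}: refused={refused}")
```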

Some participants speculated that the report’s omission may also stem from a heightened governmental focus on high-stakes risks—such as the potential use of AI systems in developing weapons of mass destruction—and a parallel effort to strengthen ties with major technology companies. One red team participant anonymously remarked that political considerations likely played a role in withholding the report and that the exercise contained insights of ongoing scientific relevance.

Disclaimer

In line with the Trust Project guidelines, please note that the information provided on this page is not intended to be and should not be interpreted as legal, tax, investment, financial, or any other form of advice. It is important to only invest what you can afford to lose and to seek independent financial advice if you have any doubts. For further information, we suggest referring to the terms and conditions as well as the help and support pages provided by the issuer or advertiser. MetaversePost is committed to accurate, unbiased reporting, but market conditions are subject to change without notice.

About The Author

Alisa, a dedicated journalist at the MPost, specializes in cryptocurrency, zero-knowledge proofs, investments, and the expansive realm of Web3. With a keen eye for emerging trends and technologies, she delivers comprehensive coverage to inform and engage readers in the ever-evolving landscape of digital finance.
