News Report Technology
August 06, 2025

OpenAI Rolls Out gpt-oss-120b And gpt-oss-20b, Bringing State-Of-The-Art Models To Local Devices

In Brief

OpenAI has released two powerful open-weight models, gpt-oss-120b and gpt-oss-20b, enabling advanced local AI performance without internet access, marking a major step in developer accessibility.

Artificial intelligence research organization OpenAI has announced the release of two advanced open-weight language models, gpt-oss-120b and gpt-oss-20b. The models deliver strong performance in practical applications while keeping operational costs low. Released under the permissive Apache 2.0 license, they surpass other open models of similar size on reasoning tasks, exhibit robust tool-use capabilities, and are optimized for efficient operation on consumer-grade hardware. Training combined reinforcement learning with techniques informed by OpenAI’s most advanced internal models, including o3 and other frontier systems.

The gpt-oss-120b model performs nearly on par with OpenAI’s o4-mini on core reasoning benchmarks and runs efficiently on a single 80 GB GPU. The gpt-oss-20b model, meanwhile, achieves results comparable to OpenAI’s o3-mini on common benchmarks and can run on edge devices with just 16 GB of memory, making it suitable for on-device applications, local inference, or rapid iteration without expensive infrastructure. Both models also perform strongly on tool use, few-shot function calling, and chain-of-thought (CoT) reasoning, as reflected in the Tau-Bench agentic evaluation suite and HealthBench, at times outperforming proprietary models such as OpenAI o1 and GPT-4o.

Both models are compatible with the Responses API and are designed to slot into agentic workflows, offering strong instruction-following, tool use (including web search and Python code execution), and reasoning capabilities. Reasoning effort is adjustable, so it can be dialed down for tasks that do not require complex reasoning or that prioritize low-latency output. The models are fully customizable, expose full chain-of-thought reasoning, and support structured output formats.
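As a rough illustration of what this looks like in practice, the sketch below sends a chat request to a gpt-oss model assumed to be served locally behind an OpenAI-compatible endpoint (for example via vLLM, Ollama, or LM Studio). The endpoint URL, the model identifier, and the use of a "Reasoning: low" system hint to lower reasoning effort are assumptions for illustration, not details confirmed in this article.

```python
# Minimal sketch: querying a locally hosted gpt-oss model through an
# OpenAI-compatible Chat Completions endpoint. The base_url, model name,
# and the "Reasoning: low" system hint are illustrative assumptions --
# adjust them to match whichever local server is actually serving the weights.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local inference server
    api_key="not-needed-for-local",       # local servers typically ignore this
)

response = client.chat.completions.create(
    model="gpt-oss-20b",                  # assumed model identifier on the server
    messages=[
        # Lowering reasoning effort trades depth for latency on simple tasks.
        {"role": "system", "content": "Reasoning: low"},
        {"role": "user", "content": "Summarize the Apache 2.0 license in two sentences."},
    ],
)

print(response.choices[0].message.content)
```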

Safety considerations are central to the release of these models, particularly given their open nature. Alongside comprehensive safety training and evaluations, an additional layer of testing was applied through an adversarially fine-tuned version of gpt-oss-120b under OpenAI’s Preparedness Framework. The gpt-oss models achieve safety benchmark performance comparable to OpenAI’s latest proprietary models, providing developers with similar safety assurances. Detailed results and further information are available in a research paper and model card, with the methodology reviewed by external experts, representing progress in establishing new safety standards for open-weight models.

OpenAI has collaborated with early partners such as AI Sweden, Orange, and Snowflake to explore real-world uses of these open models, including on-premises hosting for data security and fine-tuning on specialized datasets. The availability of these open models aims to empower a broad range of users—from individual developers to large enterprises and government entities—to run and customize AI on their own infrastructure. When combined with other models accessible via OpenAI’s API, developers can select from a range of options balancing performance, cost, and latency to support diverse AI workflows.

gpt-oss-120b And gpt-oss-20b Now Freely Available With Extensive Platform And Hardware Support

The weights for both gpt-oss-120b and gpt-oss-20b are openly available for download on Hugging Face and come natively quantized in MXFP4 format, which lets gpt-oss-120b run within 80 GB of memory and gpt-oss-20b within just 16 GB. Both models were post-trained with the harmony prompt format, and an open-source harmony renderer is available in Python and Rust to ease adoption. Reference implementations for running inference with PyTorch and on Apple’s Metal platform are also provided, along with a set of example tools for practical use.
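A minimal sketch of that download-and-run path follows, assuming the checkpoint is published under the Hugging Face repository id openai/gpt-oss-20b and that the installed Transformers build is recent enough to support the gpt-oss architecture and its MXFP4 checkpoints; neither detail is confirmed in this article.

```python
# Minimal sketch: downloading the openly published gpt-oss-20b weights from
# Hugging Face and running a single chat turn with Transformers. The repo id
# "openai/gpt-oss-20b" and the need for a recent Transformers build with
# gpt-oss/MXFP4 support are assumptions.
from huggingface_hub import snapshot_download
from transformers import pipeline

# Fetch the MXFP4-quantized checkpoint to the local cache (ample free disk assumed).
local_dir = snapshot_download("openai/gpt-oss-20b")

# Let Transformers place weights automatically (GPU if available, otherwise CPU).
generator = pipeline("text-generation", model=local_dir, device_map="auto")

# Passing chat messages lets the pipeline apply the model's own chat template,
# which for gpt-oss encodes the harmony prompt format.
chat = [{"role": "user", "content": "Explain MXFP4 quantization in one paragraph."}]
result = generator(chat, max_new_tokens=200)

# The pipeline returns the full conversation; the last message is the reply.
print(result[0]["generated_text"][-1]["content"])
```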

These models are engineered for flexibility and ease of use, supporting deployment locally, on-device, or through third-party inference providers. To broaden access, OpenAI partnered ahead of launch with major deployment platforms including Azure, Hugging Face, vLLM, Ollama, llama.cpp, LM Studio, AWS, Fireworks, Together AI, Baseten, Databricks, Vercel, Cloudflare, and OpenRouter, and worked with hardware manufacturers such as NVIDIA, AMD, Cerebras, and Groq to ensure strong performance across a range of systems.
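For fully local deployment without a separate server process, a sketch along the following lines, using vLLM's offline inference API, would apply. The model id and the assumption that the installed vLLM build already handles the MXFP4-quantized gpt-oss checkpoints are illustrative rather than confirmed here.

```python
# Minimal sketch: local batch inference through vLLM, one of the deployment
# stacks listed above. Model id and MXFP4 support in the installed vLLM build
# are assumptions for illustration.
from vllm import LLM, SamplingParams

llm = LLM(model="openai/gpt-oss-20b")            # assumed Hugging Face repo id
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(
    ["List three scenarios where fully offline inference matters."],
    params,
)
for output in outputs:
    print(output.outputs[0].text)
```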

In conjunction with this release, Microsoft is delivering GPU-optimized versions of the gpt-oss-20b model for Windows devices. Powered by ONNX Runtime, these versions support local inference and are accessible via Foundry Local and the AI Toolkit for VS Code, simplifying the integration process for developers on Windows platforms.

For developers seeking fully customizable models that can be fine-tuned and deployed within their own environments, the gpt-oss models are a suitable choice. For those requiring multimodal capabilities, built-in tools, and seamless platform integration, the models offered through the API platform remain the preferred option. OpenAI continues to monitor developer feedback and may consider API support for the gpt-oss models in the future.

The introduction of gpt-oss-120b and gpt-oss-20b represents a notable advancement in the domain of open-weight models, delivering significant improvements in reasoning abilities and safety at their scale. These open models complement proprietary hosted models by offering developers a broader selection of tools to facilitate cutting-edge research, stimulate innovation, and promote safer, more transparent AI development across diverse applications.

Furthermore, these open models help reduce entry barriers for emerging markets, resource-limited sectors, and smaller organizations that may face constraints in adopting proprietary solutions. By providing accessible and powerful tools, users worldwide are empowered to develop, innovate, and create new opportunities. Widespread availability of these capable open-weight models produced in the United States contributes to the expansion of equitable AI access.

A reliable ecosystem of open models is an essential component of broad and inclusive AI accessibility. Developers and researchers are encouraged to use these models to experiment, collaborate, and push the boundaries of what is achievable, and further progress in this field will be watched with interest.

Disclaimer

In line with the Trust Project guidelines, please note that the information provided on this page is not intended to be and should not be interpreted as legal, tax, investment, financial, or any other form of advice. It is important to only invest what you can afford to lose and to seek independent financial advice if you have any doubts. For further information, we suggest referring to the terms and conditions as well as the help and support pages provided by the issuer or advertiser. MetaversePost is committed to accurate, unbiased reporting, but market conditions are subject to change without notice.

About The Author

Alisa, a dedicated journalist at the MPost, specializes in cryptocurrency, zero-knowledge proofs, investments, and the expansive realm of Web3. With a keen eye for emerging trends and technologies, she delivers comprehensive coverage to inform and engage readers in the ever-evolving landscape of digital finance.
