June 03, 2025

Sakana AI Introduces Self-Improving Agent That Lifts Its SWE-bench Score From 20% To 50%

In Brief

Sakana AI launched the Darwin Gödel Machine, a self-improving agent that raised its own performance from 20.0% to 50.0% on SWE-bench and from 14.2% to 30.7% on Polyglot.

Japanese AI company Sakana AI introduced the Darwin Gödel Machine (DGM), a self-modifying agent capable of altering its own code. Drawing inspiration from evolutionary principles, the system maintains a growing lineage of agent variants, enabling ongoing exploration within the broad range of self-improving agent designs.

While current agent systems typically remain static after deployment, the DGM treats continuous self-improvement as a crucial factor in advancing AI capabilities. The machine is designed to support AI systems that learn and develop their abilities over time, much as humans do.

The DGM represents a notable advancement toward AI systems capable of autonomously identifying and building upon their own learning milestones to continually innovate. The system expands its archive by selecting an agent from its existing collection and employing a foundation model to generate a new, improved variant of that agent. This process of open-ended exploration creates a growing tree of diverse, high-quality agents, enabling simultaneous exploration of multiple pathways within the search space. 
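The archive loop described above can be illustrated with a minimal Python sketch. It is not Sakana AI's implementation: the `Agent` record, the uniform parent selection, the rule that every variant is kept, and the random score perturbation that stands in for the foundation-model rewrite and benchmark run are all assumptions made for illustration. The starting score of 0.20 simply echoes the article's initial SWE-bench figure.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Agent:
    agent_id: int
    parent_id: Optional[int]  # None for the initial agent
    score: float              # benchmark success rate in [0, 1]


def propose_variant_score(parent: Agent, rng: random.Random) -> float:
    # Hypothetical stand-in for "a foundation model rewrites the agent and the
    # variant is benchmarked": the parent's score is randomly perturbed.
    return min(1.0, max(0.0, parent.score + rng.uniform(-0.10, 0.15)))


def grow_archive(steps: int, seed: int = 0) -> List[Agent]:
    rng = random.Random(seed)
    archive = [Agent(agent_id=0, parent_id=None, score=0.20)]
    for agent_id in range(1, steps + 1):
        # Assumption: parents are sampled uniformly from the whole archive,
        # so any earlier agent can become a branching point.
        parent = rng.choice(archive)
        child = Agent(agent_id, parent.agent_id, propose_variant_score(parent, rng))
        # Assumption: every variant is kept, whether it scores better or worse.
        archive.append(child)
    return archive


archive = grow_archive(steps=20)
best = max(archive, key=lambda a: a.score)
print(f"{len(archive)} agents in archive, best score {best.score:.2f}")
```

Because each child records its `parent_id`, the archive forms a tree rather than a single chain, which is what allows several branches to be explored at once.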

Empirical results show that the DGM improves its coding abilities over time, refining tools such as code editing, long-context management, and peer-review mechanisms. Its performance rose on SWE-bench (from 20.0% to 50.0%) and on Polyglot (from 14.2% to 30.7%). The system consistently outperforms baselines that lack self-improvement or open-ended exploration.
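The benchmark figures can be read either as absolute gains in percentage points or as relative improvements over the starting score. The short Python calculation below works out both from the numbers reported above; the helper name `gains` is only an illustrative choice.

```python
def gains(before_pct: float, after_pct: float):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute_points = after_pct - before_pct
    relative_pct = absolute_points / before_pct * 100
    return round(absolute_points, 1), round(relative_pct, 1)


swe_bench = gains(20.0, 50.0)
polyglot = gains(14.2, 30.7)
print(swe_bench)  # (30.0, 150.0)
print(polyglot)   # (16.5, 116.2)
```

On both benchmarks the final score is more than double the starting score.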

Notably, the evolution toward the most effective agent sometimes involved intermediate agents that performed worse than their predecessors but were retained in the lineage, illustrating the advantages of an open-ended search strategy. This approach preserves a diverse archive of useful intermediate agents rather than exclusively focusing on branching from the highest-performing agent, demonstrating that progress does not always follow a linear path.
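A small Python sketch shows how a lineage with a temporary dip can still lead to the best agent. The archive below is entirely hypothetical: the agent IDs and scores are invented for illustration and are not results from the research.

```python
from typing import Dict, List, Optional, Tuple

# agent_id -> (parent_id, score); all values are hypothetical.
ARCHIVE: Dict[int, Tuple[Optional[int], float]] = {
    0: (None, 0.20),
    1: (0, 0.26),
    2: (1, 0.22),  # worse than its parent, but kept in the archive
    3: (2, 0.35),  # best agent, descended from the weaker agent 2
    4: (1, 0.30),
}


def lineage(archive, agent_id: int) -> List[int]:
    """Walk parent links back to the root and return the path root-first."""
    path = []
    current = agent_id
    while current is not None:
        path.append(current)
        current = archive[current][0]
    return path[::-1]


def dips(archive, path: List[int]) -> List[int]:
    """Agents on the path that scored below their direct parent."""
    return [
        child
        for parent, child in zip(path, path[1:])
        if archive[child][1] < archive[parent][1]
    ]


best_id = max(ARCHIVE, key=lambda i: ARCHIVE[i][1])
best_path = lineage(ARCHIVE, best_id)
print(best_path)                 # [0, 1, 2, 3]
print(dips(ARCHIVE, best_path))  # [2]
```

In this toy archive, a strategy that only ever branched from the current top scorer would have skipped agent 2 and never reached agent 3.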

The research further indicates that the improved performance of agents discovered by the DGM can be generalized across different foundation models, such as transferring from Claude to o3-mini, and across various programming languages and task domains, including Python, Rust, C++, Go, and others.

Sakana AI: Developing AI Systems Inspired By Nature And Collective Intelligence

Sakana AI is an AI research company based in Tokyo that focuses on developing AI systems inspired by natural processes. The company’s approach involves integrating multiple smaller, autonomous models to form a collective intelligence, similar to how a school of fish operates. This method differs from traditional large-scale AI models by prioritizing adaptability, resource efficiency, and long-term sustainability.

Among Sakana AI’s research projects is the “Evolutionary Model Merge” technique, which applies evolutionary algorithms to combine existing AI models. This process generates new models with targeted capabilities while minimizing the need for extensive computational power. Additionally, Sakana AI has developed the “AI Scientist,” a system designed to automate scientific research by allowing foundation models to independently carry out investigations and discovery processes.

About The Author

Alisa Davidson

Alisa, a dedicated journalist at the MPost, specializes in cryptocurrency, zero-knowledge proofs, investments, and the expansive realm of Web3. With a keen eye for emerging trends and technologies, she delivers comprehensive coverage to inform and engage readers in the ever-evolving landscape of digital finance.
