Google DeepMind Releases Gemini 2.5 Pro And Flash Models, Introduces Flash-Lite 2.5 In Preview


In Brief
Google DeepMind’s Gemini 2.5 Flash and Pro models are now generally available, with the 2.5 Flash-Lite—its most cost-efficient and fastest model in the 2.5 series—introduced in preview.

Google DeepMind, the AI division of Google, has made its Gemini 2.5 Pro and Gemini 2.5 Flash models generally available. It has also introduced a preview version of Gemini 2.5 Flash-Lite, which is positioned as the most cost-effective and fastest model in the 2.5 series to date.
Gemini 2.5 Pro is the most advanced model in the Gemini series, intended for tasks requiring complex reasoning, code generation, problem-solving, multimodal input processing, and extended context understanding. The model supports multimodal inputs including text, images, audio, video, and documents, and currently features a context window of approximately one million tokens, with an upcoming expansion to two million. It incorporates structured reasoning mechanisms and a capability referred to as Deep Think, which enables parallel processing of complex reasoning steps. Performance metrics reportedly show high scores in areas such as coding, scientific reasoning, and mathematics, based on results from benchmark tests including LMArena and Humanity’s Last Exam.
Gemini 2.5 Flash is a high-throughput model optimized for efficiency and cost, while maintaining strong performance in general-use scenarios. It has been generally available since mid-June 2025 and includes reasoning capabilities by default, which can be adjusted or disabled through API settings. The model demonstrates improvements in benchmarks related to coding, reasoning, long-context comprehension, and multimodal functionality. Token efficiency has also been increased, with reductions in cost per operation; pricing is listed at $0.30 per one million input tokens and $2.50 per one million output tokens.
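To make the listed pricing concrete, the following is a minimal sketch of how a per-request cost could be estimated from token counts. It uses only the two prices quoted above; the function name `estimate_cost_usd` and the example token counts are hypothetical, and real billing may include factors the article does not cover.

```python
from decimal import Decimal

# Prices quoted in the article for Gemini 2.5 Flash (USD per one million tokens).
INPUT_PRICE_PER_MILLION = Decimal("0.30")
OUTPUT_PRICE_PER_MILLION = Decimal("2.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request from its token counts.

    Hypothetical helper: it applies the two listed prices linearly and
    ignores anything else that might affect a real invoice.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_PRICE_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative request: 200,000 input tokens and 10,000 output tokens.
    print(f"Estimated cost: ${estimate_cost_usd(200_000, 10_000):.4f}")
```

Because output tokens are priced at more than eight times the input rate, long responses tend to dominate the estimate even when the prompt is much larger than the reply.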
According to the announcement, Gemini 2.5 models are currently in use by developers and organizations such as Spline, Rooms, Snap, and SmartBear for production-level applications.
Google DeepMind Unveils Gemini 2.5 Flash-Lite Preview, Enhancing Performance And Efficiency For High-Volume, Low-Latency AI Tasks
A preview release of Gemini 2.5 Flash-Lite has been introduced, described as the fastest and most cost-efficient model in the 2.5 family to date. The model is currently available for early use, and developers are encouraged to provide feedback during the preview phase.
Gemini 2.5 Flash-Lite is reported to show improvements across multiple performance areas—including coding, mathematics, scientific reasoning, and multimodal tasks—when compared to its predecessor, Gemini 2.0 Flash-Lite. It is optimized for high-volume, low-latency applications such as translation and classification, and exhibits reduced response times relative to both 2.0 Flash-Lite and 2.0 Flash across a diverse set of input prompts.
The model includes key functions found in other Gemini 2.5 versions, such as the ability to activate reasoning processes within variable budget limits, integration with external tools like Google Search and code execution systems, support for multimodal input, and a one million-token context window.
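As an illustration of the variable reasoning budget mentioned above, here is a rough sketch of how a request body with a thinking budget might be assembled. The field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) reflect the public Gemini API REST format as best understood at the time of writing and should be checked against the current documentation; the helper name `build_request_body` is hypothetical, and the sketch deliberately makes no network call.

```python
import json


def build_request_body(prompt: str, thinking_budget: int) -> dict:
    """Assemble a JSON-serializable request body with a reasoning budget.

    Hypothetical helper. The field names are an assumption based on the
    public Gemini API REST format; verify them before relying on this.
    """
    if thinking_budget < 0:
        raise ValueError("thinking_budget must be non-negative")
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            # Upper bound on tokens the model may spend on reasoning.
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    body = build_request_body("Classify this review as positive or negative.", 512)
    print(json.dumps(body, indent=2))
```

For the high-volume, low-latency workloads the article describes, such as translation and classification, a small budget is the natural starting point, with larger values reserved for prompts that need multi-step reasoning.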
The preview version of Gemini 2.5 Flash-Lite is currently accessible through Google AI Studio and Vertex AI, where it is offered alongside the stable versions of Gemini 2.5 Flash and Gemini 2.5 Pro. These models are also available through the Gemini mobile app, and customized deployments of 2.5 Flash and Flash-Lite have been integrated into Google Search services.
About The Author
Alisa, a dedicated journalist at the MPost, specializes in cryptocurrency, zero-knowledge proofs, investments, and the expansive realm of Web3. With a keen eye for emerging trends and technologies, she delivers comprehensive coverage to inform and engage readers in the ever-evolving landscape of digital finance.