As previously written, Google Cloud NEXT 2025 announced a major upgrade for Gemini 2.5 Pro. Gemini 2.5 Pro continues to be favored by developers and is hailed as the best model for code writing. Gemini 2.5 Flash has also become even better through new updates. Google has also brought new features to the models, including Deep Think, an experimental enhanced reasoning mode for 2.5 Pro.

At this year’s Google I/O 2025 conference, Google released Gemini 2.5 Pro, which is its smartest model to date. Microfusion Technology will also share more updates on the Gemini 2.5 model series:

2.5 Pro is now the world’s leading model on the WebDev Arena and LMArena leaderboards, and performs exceptionally well in helping people learn.
Google has brought new features to 2.5 Pro and 2.5 Flash: native audio output for a more natural conversational experience, advanced security protection, and Project Mariner’s computer usage capabilities. 2.5 Pro will be further enhanced through Deep Think, an experimental enhanced reasoning mode for highly complex mathematics and code.
Introducing Thought Summaries in Gemini API and Vertex AI for increased transparency: Expanding thought budgets to 2.5 Pro for more control, and adding support for MCP tools in the Gemini API and SDK for access to more open-source tools.
2.5 Flash is now available to everyone in the Gemini application: Google will generally make the updated version available in Google AI Studio for developers and Vertex AI for enterprises in early June, with 2.5 Pro to follow.

2.5 Pro Performance Surpasses Expectations

Google recently updated 2.5 Pro to help developers create richer, more interactive web applications. We are pleased to see the positive response from users and developers, and Google continues to make improvements based on user feedback.

In addition to its excellent performance on academic benchmarks, the new 2.5 Pro now leads the popular code leaderboard WebDev Arena with an ELO score of 1415. It also leads all leaderboards on LMArena, which evaluates human preferences across various dimensions. Furthermore, with its 1 million token context window, 2.5 Pro possesses state-of-the-art long-context and video understanding capabilities.

Since the introduction of LearnLM (a family of models co-built by Google with education experts), 2.5 Pro is now also the leading model in the learning domain. In face-to-face comparisons evaluating its pedagogy and effectiveness, educators and experts preferred Gemini 2.5 Pro over other models in various scenarios. Moreover, it outperformed top models on each of the five learning science principles used to build learning AI systems.

Please read more on Google’s updated Gemini 2.5 Pro model card and Gemini technical page.Deep Think

By exploring the boundaries of Gemini’s thinking capabilities, Google has begun testing an enhanced reasoning mode called Deep Think, which uses new research techniques to enable the model to consider multiple hypotheses before responding.

2.5 Pro Deep Think achieved impressive scores on the 2025 USA Mathematical Olympiad (USAMO), one of the most challenging mathematical benchmarks today. It also leads on the difficult benchmark for competitive-grade code, LiveCodeBench, and scored 84.0% on MMMU, which tests multimodal reasoning.

As Google is defining the technical frontier with 2.5 Pro DeepThink, Google will take additional time for more frontier safety evaluations and to gather further feedback from safety experts. As part of this, Google will make this feature available to trusted testers via the Gemini API to get their feedback before a broader rollout.2.5 Flash Reaches New Heights

2.5 Flash is Google’s most efficient workhorse model, designed for speed and low cost—and it now performs better in many ways. It has improved on key benchmarks for reasoning, multimodal, code, and long context, while becoming even more efficient, reducing token usage by 20-30% in Google’s evaluations.

The new 2.5 Flash is now available for preview by developers in Google AI Studio, by enterprises in Vertex AI, and by everyone in the Gemini application.New Features in Gemini 2.5

Native audio output and Live API improvements are rolling out a preview of audio and video input and native audio output conversations, so you can directly build conversational experiences and use a more natural, expressive Gemini. It also allows users to control tone, accent, and speaking style. For example, you can have the model use a dramatic voice when telling a story. It also supports tool use, enabling it to perform searches on your behalf.

You can try a range of early features, including:

Affective Dialogue, where the model detects emotions in the user’s voice and responds appropriately.
Proactive Audio, where the model ignores background conversations and knows when to respond.
Thinking in the Live API, where the model utilizes Gemini’s thinking capabilities to support more complex tasks.

Google also released a new text-to-speech preview in 2.5 Pro and 2.5 Flash. These features for the first time support multiple speakers, enabling text-to-speech for two voices through native audio output. Like native audio conversations, text-to-speech is expressive and can capture very subtle nuances, such as whispering. It supports over 24 languages and can seamlessly switch between them. This text-to-speech feature will be available in the Gemini API later today.

Google is bringing the computer usage capabilities of Project Mariner into the Gemini API and Vertex AI. Companies like Automation Anywhere, UiPath, Browserbase, Autotab, The Interaction Company, and Cartwheel are exploring its potential, and Google is excited to make it more widely available for developers to experiment with this summer.

Google Cloud has also significantly increased protection against security threats, such as indirect prompt injection. This occurs when malicious instructions are embedded into data retrieved by an AI model. Google Cloud’s new security approach significantly improves Gemini’s protection rate against indirect prompt injection attacks during tool use, making Gemini 2.5 Google’s most secure model family to date.

2.5 Pro and Flash will now include Thought Summaries in the Gemini API and Vertex AI. Thought Summaries organize the model’s raw thoughts into a clear format, including headings, key details, and information about the model’s operations, such as when it uses tools.

Google hopes that through a more structured and streamlined format for model thought processes, developers and users can more easily understand and debug interactions with Gemini models.

Google launched 2.5 Flash with Thought Budget, allowing developers to better control costs by balancing latency and quality. Google is also extending this feature to 2.5 Pro. This allows you to control the number of tokens the model uses for thinking before responding, and even turn off its thinking ability.

Gemini 2.5 Pro with budget will be generally available for stable production in the coming weeks with Google’s generally available models.MCP Support

Google has added native SDK support for Model Context Protocol (MCP) definitions in the Gemini API to make it easier to integrate with open-source tools. Google is also exploring ways to deploy MCP servers and other hosted tools to make it easier for you to build agent applications.

Google is always committed to innovating new ways to improve Google’s models and developer experience, including making them more efficient, higher performing, and continuously responding to developer feedback, so please continue to provide feedback! Google also continues to double down on expanding and deepening Google’s fundamental research—pushing the frontier of Gemini’s capabilities. More to come.

The content of this article is translated and adapted from the official Google Cloud blog, delving into Google’s latest advancements in the Gemini 2.5 series of models, particularly the significant improvements in inference performance, efficiency, and new features of 2.5 Pro and 2.5 Flash. From the enhanced reasoning capabilities brought by Deep Think mode to the native audio output of the Live API and increased security, all demonstrate Google’s commitment to providing smarter, safer, and easier-to-use AI solutions.

These innovations have not only achieved excellent results on academic benchmarks but have also brought substantial value to developers and enterprises in practical applications. For example, Gemini 2.5 Pro

The Era of Gemini 2.5 Has Arrived! Google Cloud’s AI Smart Models Continue to Advance

Categories

Related Posts

Industry Insights: What Matters Most to Decision-Makers Regarding Multi-Agent Systems (MAS)?

Measuring Generative AI: From Experimentation to Core Business

Google AI Hypercomputer Inference Updates: TPU and GPU Boost AI Application Performance

Take the next step to your cloud journey.

TAIWAN HEADQUARTERS

HONG KONG BRANCH

MALAYSIA BRANCH