Google on Tuesday introduced Gemini 2.5 Pro, a next-generation AI model that the company says represents its most intelligent system to date, featuring embedded reasoning capabilities and performance improvements across key AI benchmarks. The new model is being rolled out through Google AI Studio and the Gemini app for Gemini Advanced subscribers.

Described by Google as a "thinking model," Gemini 2.5 Pro is designed to pause and evaluate before generating responses. It's the company's latest move in the escalating AI race following the September 2024 launch of OpenAI's first reasoning model, "o1," which triggered an industry-wide push toward more analytical and context-aware systems.

Google stated it is "building these thinking capabilities directly into all of [its] models" to allow them to "handle more complex problems and support even more capable, context-aware agents."

Gemini 2.5 Pro, codenamed "nebula," is the first in a new line of AI models engineered to handle complex reasoning tasks, including agentic code generation, multimodal input comprehension, and long-context information synthesis. According to Google, the model integrates a significantly improved base architecture with advanced post-training techniques, enabling it to outperform earlier Gemini versions and some leading competitors.

Key performance metrics reported by Google include:

  • Aider Polyglot (code editing benchmark): 68.6%, ahead of models from OpenAI, Anthropic, and DeepSeek
  • SWE-Bench Verified (agentic software development): 63.8%, beating OpenAI's o3-mini and DeepSeek R1 but trailing Anthropic's Claude 3.7 Sonnet at 70.3%
  • Humanity's Last Exam (multimodal reasoning test): 18.8% without tool use, surpassing most flagship models

"2.5 Pro excels at creating visually compelling web apps and agentic code applications, along with code transformation and editing," Google said. The model also tops the LMArena leaderboard, which evaluates AI outputs based on human preferences, and leads in scientific benchmarks like GPQA Diamond and AIME 2025 without the use of cost-heavy inference tricks such as majority voting.
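For readers unfamiliar with the term, majority voting means sampling a model several times on the same question and keeping the most frequent answer, which multiplies inference cost by the number of samples. The sketch below illustrates only the voting step; the function name and the sample answers are hypothetical and are not taken from Google's evaluation setup.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer among several sampled responses."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    # most_common(1) yields a single (answer, count) pair for the top answer.
    winner, _ = counts.most_common(1)[0]
    return winner


# Hypothetical final answers from five independent samples of one question.
samples = ["42", "41", "42", "42", "17"]
print(majority_vote(samples))
```

Each entry in the list stands for one full model call, which is why reporting a score without this step corresponds to a cheaper, single-attempt setting.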

The rollout underscores Google's focus on developing AI agents, autonomous systems capable of executing tasks with minimal human input. Reasoning, which encompasses logic, contextual analysis, and decision-making, is seen by many in the industry as the foundational layer for such agents.

Gemini 2.5 Pro ships with a 1 million token context window, roughly the equivalent of 750,000 words, and Google says it will soon support inputs of up to 2 million tokens. This extended context allows the model to process complex documents, software repositories, and multimedia inputs including text, images, video, audio, and code.
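The word estimate follows from a ratio of about 0.75 words per token, which is what the figures above imply. The short sketch below applies that ratio to both window sizes; the constant and function name are illustrative assumptions, and real token counts vary with the tokenizer and the language of the text.

```python
# Assumed ratio, implied by "1 million tokens is roughly 750,000 words".
# It is a rule of thumb for English prose, not an exact conversion.
WORDS_PER_TOKEN = 0.75


def approx_words(tokens: int) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


for tokens in (1_000_000, 2_000_000):
    print(f"{tokens:,} tokens is roughly {approx_words(tokens):,} words")
```

Under the same assumption, the planned 2 million token window would hold about 1.5 million words.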

Gemini 2.5 Pro replaces last month's 2.0 Pro (experimental) in the Gemini app, where users can now activate a "Show thinking" mode to observe the model's reasoning process. The model supports @Gmail, @YouTube, and file uploads, and it is expected to be integrated into Google's Vertex AI platform in the coming weeks.