Google has released Gemini 2.0 Flash Thinking, a model that adds explicit reasoning capabilities to its Gemini 2.0 Flash AI series. The new model is designed to demonstrate its “thinking process” as it works, improve reasoning quality, and compete with leading systems on the market such as OpenAI’s o1 series.
This experimental model is available in Google AI Studio and Vertex AI, and developers can also access it through the Gemini API. In a post on X (formerly Twitter), DeepMind’s chief scientist, Jeff Dean, highlighted the model’s functionality, demonstrating how Thinking Mode breaks a problem into smaller components and presents its reasoning steps before delivering a solution.
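For developers, a minimal sketch of calling the model through the Gemini API might look like the following, using Google’s google-generativeai Python SDK. The model ID "gemini-2.0-flash-thinking-exp" and the placeholder API key are assumptions for illustration, not details confirmed by the article.

```python
# Minimal sketch: calling the experimental Thinking model via the Gemini API
# with Google's google-generativeai SDK. The model ID below is an assumption.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key obtained from Google AI Studio

model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")  # assumed ID
response = model.generate_content(
    "A ball is dropped from a 20 m tower. How long does it take to land?"
)
print(response.text)  # the final answer; AI Studio additionally surfaces the reasoning steps
```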
A demo video shared by Dean walked through the model solving a physics problem step by step. In another example, Logan Kilpatrick, the product lead at Google AI Studio, showed the model solving mathematical problems from combined text and image inputs.
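The text-plus-image workflow Kilpatrick demonstrated could look roughly like this with the same SDK; again, the model ID and the image file name are illustrative assumptions.

```python
# Sketch of a combined text-and-image prompt, mirroring the demo described
# above. The model ID and image file are hypothetical placeholders.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")  # assumed ID

problem = Image.open("math_problem.png")  # e.g., a photographed equation
response = model.generate_content(
    ["Solve the problem in this image, showing your steps.", problem]
)
print(response.text)
```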
The Gemini 2.0 Flash Thinking model incorporates a dedicated dialogue module that visually depicts its decision-making process, showing users how the model moves from the input data toward a solution. In the shared demos, the model worked through problems by dividing large tasks into smaller ones, displaying each step in real time so viewers could follow the reasoning that led to the final output.
Thinking Mode is built as an extension of the Gemini 2.0 Flash model and inherits its multimodal capabilities, including long-context reasoning, image and audio processing, and agent-based interactions. The update represents Google’s effort to advance explainable AI and make the process behind a model’s answers more transparent.
Earlier this month, Google introduced Gemini 2.0 with several advances in AI technology. The base Gemini 2.0 Flash model can perform multimodal reasoning and long-context understanding, and it lays the groundwork for agentic experiences in which users can delegate tasks to AI agents.
The release also showcased several AI prototypes, including Project Astra, a universal assistant capable of processing visual and audio data, and Project Mariner, which reasons over on-screen browser content and completes tasks through an experimental Chrome extension. In addition, Jules, an AI coding assistant, helps developers with programming tasks, while gaming agents enhance gameplay by offering real-time suggestions.