Google is expected to announce the next-generation AI model in its Gemini family in early December, one year after the launch of Gemini 1. It is expected to be a bigger change than the Gemini 1.5 version announced in May.
According to The Verge, despite being a big step up from Gemini 1, the new model is not as powerful as Google would like. This could be because Gemini 1.5 was better than expected, or it could be that they have reached a leveling point where features begin to matter more than overall performance and capabilities.
OpenAI is splitting its model lineup by creating a new o1 family that is good at reasoning but not so good at other tasks, alongside the more versatile GPT-4o (Omni) model. It is possible that Google will follow a similar path with Gemini 2.
AI labs have a habit of making big announcements during the holiday season and holding the actual release until the New Year. Gemini 2 will probably be no different. Google will likely announce new variations of Ultra and Pro, but they won't reach the Gemini app until 2025.
Each new generation of models could bring new features, new training data sets, and even new ways of prompting compared with previous versions. Based on AI's scaling law of computation + data + time = better models, each new generation should be more intelligent, more capable, and better at reasoning.
It is unclear what new features will be added in Gemini 2. When Gemini 1 was released, we saw multimodal capabilities, including the ability to understand images and videos. Google could perhaps expand on this and include spatial data to give the model knowledge about the world and real-world physics. We saw hints of this in Project Astra (Gemini Live + Lens).
I think it is more likely that we will see broad improvements in reasoning and reliability. Some of the "thinking" capabilities seen in reasoning-focused models such as o1 may also be unlocked in the broader model. The most significant changes are likely to come in the form of agents.
An agent is a model that can carry out tasks on its own, without relying on human input beyond the initial prompt. For example, if you tell Gemini to book a flight to Paris with certain parameters, Gemini will book it for you and send you the ticket. That would be a new capability in its own right, and it could also improve accuracy and allow for more detailed responses. Google will also probably improve Gemini's search and access to live data, as it is increasingly competing with OpenAI.