AI developers have been trying to crack the digital personal assistant nut for some time. Gemini Live, announced at Made by Google earlier this week, is Google's new attempt to make this happen. So I put this AI to the test for 24 hours to see how close I could get to a truly useful AI.
I am not accustomed to chatting directly with AI assistants, beyond asking them to set a timer while I cook, but I wanted to see what the benefits of having an open-ended conversation with an AI like Gemini would be. And after this day of testing, I am at least convinced of the value of conversing with an AI in this way.
While my experiment with Gemini Live is far from a formal test of its capabilities, the range of questions I asked it gives a good impression of what it can and cannot do well. Thus, I am confident in my assessment that Gemini Live is a good addition to the Gemini package, and probably reason enough for free users to become paid users of the $20/month Gemini Advanced, even if it has not yet achieved all its goals.
Gemini Live is offered as part of the Gemini Advanced subscription, but as of this writing it is not yet available to all users. Fortunately, I was able to try it out on a Google Pixel 9 Pro XL. If you want to learn more about this phone, you can read my review of the Google Pixel 9 Pro XL, since the focus here is only on Gemini Live.
Another drawback is that you currently have to set the language to US English to use it. Fortunately, even after setting this, I was able to choose a British voice, named “Capella,” from the 10 different voices available for Gemini Live. All of the voices sounded very natural, just with different levels of enthusiasm and pitch. When I started asking questions, I seldom noticed particularly egregious mispronunciations or oddly phrased sentences.
With everything set up, my first major interaction with Gemini Live was to ask for directions home. After I told it my transportation of choice and confirmed the station I wanted to go to, Gemini Live did not at first tell me what it had found. After a long wait, I prompted it to actually tell me what it had found, and it explained the route.
I probably could have taken that route home, but it would not have been the smoothest of journeys. Gemini had mixed up one of the lines with one of the stations, and did not realize that one of the changes required me to walk between two stations. Gemini claims to have checked the information on the Transport for London website.
This is more a problem with the underlying AI model than with Gemini Live, but having an authoritative-sounding voice (with a British accent) suggest routes can be very confusing to someone who is not very familiar with public transportation in London. It would be better to stick to Google Maps for this sort of thing.
The next day, while getting ready for work, I asked Gemini for the day's breaking news. In response to just one prompt, it told me a lot about the change of host for Good Morning Britain and This Morning. But things got even stranger when I asked about technology-related news.
Gemini initially told me that Microsoft had announced the Surface Duo 3. The PS5 Slim is real, but it was released last fall, and I can only assume that the last comment referred to the CrowdStrike outage that occurred last month.
I then asked Gemini Live to home in on the iPhone rumors, but initially the responses all related to the currently available iPhone 15 lineup. When prompted further, it explained some rumors about the iPhone 16 camera, but not in much detail.
After a few hours of work, it was time for a coffee break, so I tried to get Gemini Live to guide me through brewing a V60 pour-over.
I expected to get step-by-step instructions from the AI, but the problem here is that to get Gemini Live to deliver its answer as individual steps, I needed to continually prompt or interrupt it. To its credit, Gemini Live was able to keep the conversation going and give well-reasoned answers, even when it misheard my prompts.
On the knowledge side, Gemini was a mixed bag. It offered enthusiast-level tips such as filtering the water before boiling. The overall recipe, though simple, resulted in a cup that was easy to drink. However, Gemini Live gave me the amount of coffee in tablespoons, not grams or ounces. With additional prompting, I was able to get the amount in grams.
After lunch, I had a little time to talk to Gemini Live about Street Fighter 6, the game I'm playing the most. It correctly named the Evo 2024 SF6 champion and his opponent, but again didn't give me much detail at first.
I moved the conversation to training advice (I tend to rely too heavily on certain techniques), where I got some suggestions on how to rethink my approach in a match. It was easier said than done, but still valid advice for a situation where my opponent is throwing fireballs at me.
I also tried to get it to tell me where I could meet other players in person, but this didn't quite work out. It tried to find more information on the official website, but found that there was nothing there other than the official Capcom convention. It then found a nearby Facebook group for me, but was unable to give me a link that I could access later in the transcript.
For my final Gemini task, I decided to go meta. I had it help me draft the intro to this very article.
After Gemini had given me so little detail in its previous responses, I was surprised that it would suggest specific wording. When I asked it to include more information or change the angle, Gemini responded in a logical way. And, as Google proudly pointed out in the Made by Google demo, Gemini Live can respond to interruptions and adjust its answers on the fly.
This is where Gemini Live felt its best, and iterating on an idea out loud feels completely natural, even when you're talking into a glowing waveform on your phone. In the end, I wrote the intro to this article from scratch, but you can scroll back up to compare it with the final proposal Gemini gave me.
You might think from this article that I don't think highly of Gemini Live, but that is not the case. The problems in some of the test scenarios seemed to come from the Gemini Advanced model misunderstanding what I was looking for. Interestingly, our recent Gemini vs Gemini Advanced matchup shows that I might have been better off sticking with the basic Gemini.
Gemini Live, on the other hand, was very impressive in its own right. Being able to hold a continuous conversation with a chatbot seems like a much better way to interact than via text or image prompts, as long as you are willing to be specific and to interrupt if it gets off track. You can ask follow-up questions to a regular digital assistant, but it's not as seamless as Gemini Live proved to be. And that seamlessness is what makes it practical for answering questions and providing guidance: not only is it hands-free, it also frees up your eyes to focus on other things while you and the chatbot are talking. That said, Gemini Live interprets speech as text before responding, whereas ChatGPT Voice can process speech directly. But even with the usual AI caveats, it feels like Google is on the right track in pursuing its dream of a digital personal assistant.