Hello Everyone,

I’m currently on my summer break, so this will only be a short note. Suffice it to say, I’ve been writing about Voice AI for more than a decade, and I’m bullish on the overall trend: the next five years in Voice AI (2024-2029) are going to be amazing, connecting AI with apps, utilities and eventually agentic AI capabilities. This will take a while to get really good. However:

We’re about to enter a golden era of Voice AI capabilities.


Also Read:

🛠️ AI for Non-techies: Meeting AI assistants and Notes.

🔧 AI for Non-techies: AI Tools for Email writing and Inbox management.

OpenAI, Apple, Google and Amazon

With both Siri (Apple) and Alexa (Amazon) due for a Generative AI upgrade, it’s fascinating to witness OpenAI’s advanced voice mode and the recent launch of Gemini Live. Let’s think about Gemini Live for a second.

As we try to be patient about the more useful aspects of Apple Intelligence, Gemini Advanced subscribers can use Gemini Live for conversational voice chat. That’s $19.99 a month, in case you were wondering.

With the Pixel Buds Pro 2, Gemini Live could get interesting. The synergy across Google’s products is set for a fairly significant upgrade by 2025. Gemini Live with Pixel Buds Pro 2 can provide a mobile conversational experience that uses Google’s state-of-the-art speech technology to let you have extended conversations with Gemini. You can also see how Gemini Live really is a competitor to OpenAI’s voice assistant. Amazon, Apple and Microsoft will need to play catch-up.

BigTech is going all-in on providing more value in its AI subscriptions. Gemini Live offers smartphone users a more natural conversational experience with the Gemini AI chatbot, one that is also deeply integrated with Google’s other products and could become convenient and useful over time.

Gemini Live

At least Generative AI upgrades the voices of AI. Finally.

Learn about the Gemini Advanced Subscription

Google says that conversations with Gemini Live can be “free-flowing,” so you can do things like interrupt an answer mid-sentence or pause the conversation and come back to it later. Contextual awareness is certainly going to improve over time.

Initially, Gemini Live is available to anyone with a modern Android handset, a $19.99 per month Gemini Advanced subscription and language settings set to English.


While OpenAI’s advanced voice mode is great for interacting with ChatGPT and their best frontier model, Gemini Live enables me to seamlessly interact with Gmail, Google Calendar, YouTube and other Google apps, at least in theory. This could start to become actually useful in the coming months. In the meantime, you might want to consider whether the Gemini Advanced subscription is really worth the price. I think Google will do a fair job of improving the value of this subscription over the next 6 to 18 months. Judging by the revenue OpenAI is bringing in with its ChatGPT subscription, can Google find similar success?

Which brings us to a key point a lot of the news around Gemini Live is missing:

The Real Game Changer for Gemini Live 🚀

Since Google has licensed Character.AI’s models in an expensive licensing and acqui-hire deal, Gemini Live will likely be integrated with AI-companionship-level capabilities by 2025.

This could mean Gemini Live overtakes its competitors in the near future, not just in terms of convenience within the Google and Android ecosystem, but also in terms of mental health support and AI-powered chat that’s more helpful psychologically.

Visit Character.AI

Character.AI co-founders Noam Shazeer and Daniel De Freitas will rejoin Google, where they worked until 2021, alongside more than two dozen Character.AI researchers. This is like Microsoft’s deal with Inflection AI (maker of Pi), in which it hired a significant part of their team. Except Google got the better company and the more compelling product.

Google basically signs a non-exclusive license for Character.AI’s models, and also buys out venture investors at around a $2.5 billion valuation. The size of its total investment hasn’t been disclosed. These quasi-acquisitions (Adept by Amazon, Inflection by Microsoft and now Character.AI by Google) are really notable. But to me this one is by far the most impactful.

Google demos aside, I actually think Gemini Live has a lot of potential. Note that Gemini Advanced as a subscription service will be free for Pixel 9-series owners. If you can trust Google with hardware, and trust that it can improve Gemini Live and other features, this is a fairly competitive AI-augmented device.

Character.AI is dead (not officially), but long live Gemini Live and whatever it can become. So would I want to use GPT-4o Advanced Voice and Gemini Live, and potentially improved Amazon Alexa devices and Siri as well? It will all depend on how good these Voice AIs become. I’m less optimistic about Amazon and Microsoft’s ability to keep up.

🏁 The Race for Voice-AI is On 🙏

An Interoperable and Ambient Google AI?

✨ The Gemini Live icon, a waveform with a sparkle, appears in the bottom right corner of the Gemini overlay and fullscreen app. I wonder what you could do with this if you owned a Google Pixel Watch 3. Will Google become an ambient computing company? There’s far more potential here than meets the eye, as much as we have liked to make fun of Google in the recent past.

Google is betting that Gemini Advanced will bundle well as they keep adding features and products. It all depends on the integrations and interoperability of something like Gemini Live with Google’s other apps and products. Think of some of the products you might use in this ecosystem:

Gmail

Google Calendar

Google Search

Google Maps

Google Photos

YouTube

Google Docs

Google Tasks

Google Keep, and so many more.

Gemini Live could be one of the main ways you start to interact with these apps that you already use.

The new Gemini Live became available in English to Gemini Advanced subscribers on Android phones on Tuesday (Aug. 13) and will be expanded to iOS devices and more languages in the coming weeks.

Gemini Advanced users are getting Gemini Live, but in all honesty I think it will take Google at least until 2025 to get this right.

In recent months Google has really started to win the AI talent wars, with a lot of talent in the space going to work at Google DeepMind, including a bunch of OpenAI’s own talent. In early August 2024, the Gemini 1.5 Pro (experimental 0801) model showed a lot of improvements on benchmarks.

Functionally, What to Expect from Gemini Live?

Google also has 10 new Gemini voices for users to pick from, with names like Ursa and Dipper.

Gemini Live lets you interrupt the conversation without disrupting the entire experience.

Gemini Live is nowhere near a polished product on launch, and you’ll soon get used to interrupting it.

How about that Demo of August 13th?

So far reviews of Google Gemini on Twitter, Reddit and other places have been quite mixed. Gemini Live is exceedingly important to Google’s consumer AI strategy so they basically have to get this right.

Apple’s much-anticipated rollout of Apple Intelligence has been delayed until Spring 2025, according to some recent reports. If Google can get Gemini Live right well before Apple, Amazon, Microsoft and others with important ecosystems, it will be very good for their future AI leadership. Google Assistant never really lived up to its potential, so I think there’s a lot more riding on how well Gemini Live does in the second half of the 2020s.

Forget AGI, Ambient Computing by 2035

With generative AI moving to phones, the market is going to get a lot more interested in next-gen apps and Generative AI experiences, and in how they integrate the different things in our lives.

When AI becomes more of a hands-free experience, I think we begin to be augmented by it rather than bogged down by it. Agentic AI will take years to get good, but all of this is coming together in the 2020s, and it’s fairly exciting to witness.

Voice AI convenience and capabilities could have a big year in 2025, and especially in 2026. Gemini Live is, relatively speaking, a first mover now in 2024.

Google has incredible potential to keep building AI into its ecosystem, even with antitrust considerations pending. My hunch is that their AI leadership reasserts itself in 2025. Gemini Live working well will be key for that to happen.

Pixel Watch 3

Watch Video Reviews:

CNET

Matt Berman (go to 10:55).

Watching OpenAI with their ChatGPT, advanced voice mode and SearchGPT will keep Google real. Ultimately that’s good for consumers.

My gut tells me Gemini Live will get a lot better in 2025. What do you think?


Read More in  AI Supremacy