What can we expect from today's OpenAI event?

OpenAI is holding a launch event to promote its new products, putting it head-to-head with Google, Apple, and Microsoft.

So what can we expect from OpenAI's first official launch event? Probably not what you might expect, as the company shifts its focus from models to products.

Rumors had suggested that there would be a new search engine and GPT-5, but according to CEO Sam Altman, that is not likely to happen.

He said on X that the announcement is "not GPT-5, not a search engine." The rumored assistant would reportedly be less like Siri or Alexa and more like Samantha from the movie "Her."

I agree with the rumors: a voice assistant of some sort is the most likely announcement at the OpenAI event.

However, a true voice assistant would require a significantly upgraded model with improved speech recognition and speech understanding. That would likely mean a new version of OpenAI's already capable Whisper transcription model.

We might also see the new assistant behave more like an agent, performing actions on your behalf, independently, across the open web.

These alternative models and potential agents could be included in ChatGPT Plus, the premium plan for OpenAI's flagship product.

If ChatGPT Plus gets a major upgrade, the free tier would likely be upgraded too, eventually gaining GPT-4 and DALL-E.

In the movie "Her," the AI character Samantha is designed to adapt and grow through her interactions with humans. Over time, Samantha develops self-awareness, emotional depth, and the ability to form meaningful connections.

We have seen hints of OpenAI leaning in that direction; ChatGPT can remember what you type and use it in future conversations. Also, if you have interacted with a voice agent in the ChatGPT app, it will sound more emotional, including human-like pauses and inflections.

I don't believe for a minute that something on the scale of Samantha will emerge. But if OpenAI can develop an improved end-to-end voice AI that acts on your behalf and integrates into other devices, that would be a "magical" moment.

The biggest change would be a move to end-to-end speech. Currently, ChatGPT Voice converts your speech into text, sends that text to the AI model, gets text back, and converts that text to speech. Each step adds latency, which makes real conversation awkward.
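The cascaded pipeline just described can be sketched as follows. The stage names and delay figures here are illustrative assumptions, not OpenAI's actual implementation; the point is simply that the lag is the sum of sequential stages.

```python
# Minimal sketch of a cascaded (speech -> text -> model -> text -> speech)
# voice pipeline. Delays are made-up illustrative numbers, not OpenAI's.

def transcribe(audio: str) -> tuple[str, float]:
    # Speech-to-text stage, with a hypothetical ~0.4 s delay.
    return f"text({audio})", 0.4

def generate(text: str) -> tuple[str, float]:
    # Text-only language model stage, with a hypothetical ~1.2 s delay.
    return f"reply({text})", 1.2

def synthesize(text: str) -> tuple[str, float]:
    # Text-to-speech stage, with a hypothetical ~0.4 s delay.
    return f"audio({text})", 0.4

def cascaded_reply(audio: str) -> tuple[str, float]:
    """Run the three stages back to back; each waits on the previous,
    so their delays add up into the total conversational lag."""
    text, t1 = transcribe(audio)
    reply, t2 = generate(text)
    speech, t3 = synthesize(reply)
    return speech, t1 + t2 + t3

speech, lag = cascaded_reply("hello")
# lag is roughly two seconds under these illustrative numbers;
# an end-to-end speech model would collapse the stages instead.
```

An end-to-end model would accept and produce audio directly, removing the two conversion stages rather than merely speeding them up.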

Unlike Siri or Gemini, where you ask a question, wait, and hope the AI was trained or programmed with the answer, a true voice assistant would simply hold a natural, human-like conversation.

Agents are the next big trend in artificial intelligence. Agents are smaller AI models coordinated by a main model such as GPT-4, but capable of handling tasks on their own.

For example, if you tell ChatGPT, "It's my wife's birthday, but I forgot," ChatGPT can find a gift based on what you have said about her in the past, order the gift, send a message to her and arrange delivery.
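One way to picture that kind of delegation is a main model decomposing a goal into subtasks and dispatching each to a specialized sub-agent. The agent names, subtasks, and outputs below are hypothetical, for illustration only.

```python
# Sketch of the orchestrator / sub-agent pattern described above.
# All agents and tasks are hypothetical stand-ins.

def search_agent(task: str) -> str:
    return f"found gift for: {task}"

def order_agent(task: str) -> str:
    return f"ordered: {task}"

def message_agent(task: str) -> str:
    return f"sent message: {task}"

# Registry the orchestrating model would pick from.
AGENTS = {
    "search": search_agent,
    "order": order_agent,
    "message": message_agent,
}

def orchestrate(goal: str, plan: list[tuple[str, str]]) -> list[str]:
    """The main model turns a goal into (agent, subtask) steps and
    dispatches each step to the matching sub-agent."""
    return [AGENTS[name](subtask) for name, subtask in plan]

results = orchestrate(
    "wife's birthday",
    [("search", "gift ideas"),
     ("order", "chosen gift"),
     ("message", "birthday note")],
)
```

In a real system the plan itself would be generated by the main model rather than hard-coded, and each sub-agent would call external tools (shops, messaging, delivery) instead of returning strings.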

An example of such a "swarm" of agents can be seen in the AI developer platform Devin, which, when you tell it what to build, performs all the necessary actions to accomplish its goal, from web browsing to downloading images.

More Sora videos will likely be shown, and we may learn when Sora will be available to the public. We may also get a first look at how well Voice Engine, OpenAI's alternative to ElevenLabs, actually works.

The focus will likely be on products rather than the underlying models. We are entering the commercial age of AI.

This is not to say that new models will not emerge. Altman has already stated that GPT-5 will be a significant improvement over GPT-4 and that a large sum of money will be invested to develop a super-intelligent AI.

This event feels like an AI lab entering the commercial arena and telling the world, "We are as serious about our product division as we are about our research."
