OpenAI announced earlier this week that most users will have to wait until the fall before they can take advantage of GPT-4o's advanced voice features, but it seems a few lucky people got a sneak peek at what is possible with the next-generation voice assistant.
Reddit user RozziTheCreator was one of the lucky few. They shared a recording of a new GPT-4o voice we had never heard before telling a horror story, complete with story-related sound effects such as thunder and footsteps; AI writer Sambhav Gupta was the first to pick up the clip on X, where it received widespread attention.
It appears RozziTheCreator was never supposed to have that access: OpenAI told me in a statement that some users were mistakenly granted access to the model, but that this has since been corrected.
Until now, every video of GPT-4o's advanced voice has come from OpenAI itself, and while the demos have sounded great, they have been tightly controlled.
RozziTheCreator's new video seems to show the model's capabilities in a more natural setting, including a previously unheard-of sound-effect feature.
When I messaged RozziTheCreator about the experience, they said it "came out of nowhere." The discovery happened late at night, just as they were about to ask the chatbot a question: "boom," and there it was.
The new voice only lasted a few minutes and was "very buggy," according to RozziTheCreator, so they did not have time to try much, but they managed to record this remarkable snippet of the story.
According to RozziTheCreator, the model also "started going crazy, repeating and replying to things I never said."
In the video, GPT-4o can be heard enthusiastically telling the story in a casual tone, backed by sound effects. It begins: "Picture this, there's this little town, and there's this little house at the end of the street that everyone and anyone knows about."
The story continues with two teenagers checking out the house during a storm with "only a flashlight and a cell phone for light."
OpenAI is gradually rolling out a host of new features. The first Plus users were supposed to get GPT-4o's advanced voice this month, but the rollout was delayed over safety issues and concerns about whether the infrastructure was in place to support it.
I asked OpenAI how RozziTheCreator ended up with access, and a spokesperson told me: "While testing this feature, we accidentally sent an invitation to a small number of ChatGPT users. This was a mistake and we fixed it."
They confirmed that the first Plus users will get access next month, but for most it will take a while longer; they explained that the initial rollout is "a plan to collect feedback and expand based on what we learn."
While most of us still can't hear GPT-4o's voice for ourselves, this is the latest in a series of examples of the model seemingly straining to break free of its shackles and show its full potential; I have seen examples of GPT-4o analyzing an audio file directly one moment and running it through code the next.
All of this leaves me even more excited about GPT-4o's full capabilities, and even more frustrated by the delays.