Meet Udio - the most realistic AI music production tool I've ever tried

Udio, the latest AI music tool to hit the market, has come out of stealth with a bang, showcasing an uncanny ability to capture emotion in synthetic vocals.

The brainchild of a former Google DeepMind engineer, the platform has already attracted investment and attention from some in the music community, including will.i.am and Common.

Prior to its big launch on X and other platforms, a handful of songs were leaked, leading to speculation about how good this new AI tool would be. I've been trying it out for a little over a week, and in my opinion, it's a Sora moment for AI music.

Like Suno, which is still an impressive tool, Udio creates complete tracks from text prompts, but its vocals are much better and sound more natural.

Its ability to capture not only the emotion of a song but also the bizarre and unexpected, while maintaining musical fidelity and cohesiveness, is astounding. For example, I generated all of the songs in this story, blending unusual genres with ease.

I had the opportunity to speak with the founders David Ding and Andrew Sanchez about Udio, and they told me that it was inspired by their desire to make it easier to create and share music.

"This is a magic moment," Sanchez said. "Creating something from scratch is truly magical for people." That's why they decided to focus, at least initially, on making it possible to create a complete song from text.

Future updates will include more musician-specific tools, such as the ability to add reference vocals, more granular creation options, and easier import of external tracks. For now, the focus is on building a library of great tracks from people with little or no musical background.

The two did not reveal the underlying architecture of the model or the training data, but said they have strong copyright protection measures in place. For example, as with Suno, it is not possible to reference a specific artist.

Like other AI tools, it starts with text. When you enter a prompt and click generate, it creates two completely different tracks on that theme. You can also add your own lyrics, make the song an instrumental, or add more specific genre tags to steer the generation.

After a week of use, I found that the most accurate results came from giving a rough one-line lyric and story to orient the text model, followed by a descriptive genre to orient the music model.

When a track is generated, the task is split in two: first a traditional large language model writes the lyrics, then a diffusion transformer model, like those found in OpenAI's Sora and Stable Diffusion 3, creates the music.

Users can publish tracks for the community to enjoy, download audio and video files to share on other social media platforms, and develop them into other projects.

One use case noted by the team and the artists they worked with is using Udio as a compositional aid. A songwriter can take a set of lyrics, define a melody, instantly create a demo, and send it to the artist to record in a real studio.

"This is a whole new renaissance, and Udio is the tool for creativity in this era."

In less than a minute I was able to create a haunting but foot-stomping gothic bluegrass track about a haunted house. I was able to select one of the generated tracks and expand it with fine control, adding intros, pre/post segments, and outros.

The resulting song was surprisingly effective, despite the apparent jumble of genres. The AI model was able to create something compelling, original, and somewhat bizarre, all from text.

The team continues to discover new skills they didn't realize Udio had. "We recently discovered that Udio can play traditional Chinese folk music. We've also been able to hear it sing in Korean, Japanese, and other languages."

"Nothing compares to the ease of use, sound quality, and musicality we have achieved with Udio."

Looking ahead, the team is working on support for more languages, the ability to split stems from individual tracks, and even the ability to specify vocalists.

One possibility is that Udio could be used instead of sending GIFs, letting people express themselves and share their emotions with a loved one in the form of a song. Instead of sending a card, one could message a 30-second song for a loved one's birthday.
