AI Video Takes a Big Leap Forward - Pika Labs Adds Lip Sync

Pika Labs, one of the leading AI video platforms, has added a new feature that lets users give voices to their generated characters.

Lip Sync, built in partnership with the AI audio platform ElevenLabs, can put words in the mouths of people in generated videos and synchronize their lip movements to the sound.

If filmmakers want the characters in a generated video to speak, they must either accept that the lips won't move or composite footage of a real actor into the generated clip.

Lip Sync changes that. This new tool marks an important moment for generative AI video. I would argue that, if properly implemented and once the initial problems are ironed out, it will be as big a deal as OpenAI's announcement of Sora.

Until now, most AI-generated video clips have been little more than shots of a scene, person, or situation. There was no interactivity: no characters talking to the camera or to anyone else on screen.

Without realistic characters who can speak to the audience, most of these videos have ended up as glorified slideshows or footage for music videos.

I have done both, as well as creating fictional trailers for TV shows and commercials.

Lip Sync is currently only available to users on the Pro plan and above, so I haven't tried it myself yet. From what I've seen of other people's generations, it's not perfect, but it is very close to production ready. At the very least, it would be an inexpensive way to get a pilot off the ground quickly.

The feature offers text-to-speech conversion powered by ElevenLabs, or direct upload of audio if you already have your own recording, such as a podcast or audiobook.
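For anyone curious about the text-to-speech half of that workflow, here is a minimal sketch of generating a dialogue track with ElevenLabs' public REST API. The API key and voice ID are placeholders you would take from your own ElevenLabs account, and Pika's internal integration details aren't public; the resulting audio file is simply the kind of input a lip-sync tool would synchronize to.

```python
import requests

# Placeholder credentials: substitute your own API key and a voice ID
# from your ElevenLabs account.
API_KEY = "your-elevenlabs-api-key"
VOICE_ID = "your-voice-id"

url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"
headers = {
    "xi-api-key": API_KEY,
    "Content-Type": "application/json",
}
payload = {
    "text": "Welcome to the show. Let's get started.",
}

# On success, the endpoint returns raw audio bytes.
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()

# Save the synthesized dialogue; this file could then be fed to a
# lip-sync step alongside the generated video clip.
with open("dialogue.mp3", "wb") as f:
    f.write(response.content)
```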

Similar functionality is already offered by tools such as Synthesia, which is more focused on enterprise customer service and generates talking heads rather than characters.

Runway and Pika Labs have been the dominant platforms for true generative video in recent months; Runway announced a synthetic voice-over service last year, but it is not synchronized with the video.

However, with all the major companies exploring generative video and OpenAI unveiling its very impressive Sora AI video platform, the competition is starting to heat up.

Stability AI also has a new version of Stable Video Diffusion, Leonardo offers motion for AI-generated images, Google has Lumiere, and Meta has Emu, so the early movers are being forced to add new features before everyone else catches up.

Until now, generative AI has been siloed: tools to create images, tools to create video, services to write scripts, and something else again to add sound. The next phase will be a higher level of convergence, with platforms emerging that offer complete end-to-end production from simple text prompts.

ElevenLabs is also working on a sound effects library, and combined with a music generator such as Suno, a single platform that can "turn this script written by ChatGPT into a short film" may soon emerge.

Within minutes, a series of clips would appear on the timeline, with characters speaking in ElevenLabs' synthesized voices and appropriate sound effects and music bringing the entire piece to life.

There have been fears that AI would become like Skynet and take over our lives, but the evidence (so far) suggests that AI just wants to entertain us.
