Researchers at Google have detailed a new space-time diffusion model called Lumiere that turns text or an image into a realistic AI-generated video, with capabilities for on-demand editing.

Lumiere is designed to portray “realistic, diverse and coherent motion” through what the researchers call a “Space-Time U-Net architecture,” which generates the entire duration of the video at once, in a single pass through the model.

In the paper, the researchers explained:

“By deploying both spatial and (importantly) temporal down- and up-sampling and leveraging a pre-trained text-to-image diffusion model, our model learns to directly generate a full-frame-rate, low-resolution video by processing it in multiple space-time scales.”
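
As a rough illustration of what down- and up-sampling in both space and time means, the sketch below shrinks a toy video along its frame, height and width axes and then stretches it back. It is not Lumiere's implementation: the function names, the average pooling, the nearest-neighbor upsampling and the array sizes are all assumptions chosen for simplicity, and a real model would use learned layers instead.

```python
import numpy as np

def downsample(video, t=2, s=2):
    """Average-pool a (frames, height, width) array by t in time and s in space.

    Hypothetical helper: assumes each dimension divides evenly by its factor.
    """
    f, h, w = video.shape
    blocks = video.reshape(f // t, t, h // s, s, w // s, s)
    return blocks.mean(axis=(1, 3, 5))

def upsample(video, t=2, s=2):
    """Nearest-neighbor upsampling: repeat values along time and both spatial axes."""
    return video.repeat(t, axis=0).repeat(s, axis=1).repeat(s, axis=2)

# A toy clip: 16 frames of 32x32 pixels (sizes are illustrative only).
video = np.random.rand(16, 32, 32)

coarse = downsample(video)   # fewer frames and fewer pixels per frame
restored = upsample(coarse)  # back to the original shape, with detail lost

print(coarse.shape)    # (8, 16, 16)
print(restored.shape)  # (16, 32, 32)
```

The point of the sketch is only that the time axis is compressed alongside the spatial axes, which is the step the researchers flag as important.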

This means users can enter a text description of the video they want, or upload a still image with a prompt, and generate a dynamic video.

Users have compared Lumiere to ChatGPT, but for video. According to the paper, the model supports text-to-video and image-to-video generation, stylization, editing, animation and more.

While other artificial intelligence video generators already exist, such as Pika and Runway, the researchers say their single-pass approach to handling the temporal dimension of video generation is novel.


Hila Chefer, a student researcher who worked on the model with Google, posted an example of the model’s capabilities on social media platform X.

Users on X have called the development “an incredible breakthrough” and “state-of-the-art,” with some speculating that video generation is “gonna get crazy” in the next year.

Lumiere was trained on a data set of 30 million videos and text captions and can generate 80 frames at 16 frames per second, which works out to five seconds of video. However, there has been no mention of the source of the data Google used to train the model, a heated topic in the world of AI and copyright law.

Since the explosion of generative AI models available for public use, dozens of copyright infringement lawsuits have been filed against developers over the alleged misuse of content during training.

One of the most prominent cases was filed by The New York Times against Microsoft and OpenAI, the creator of ChatGPT, for allegedly “illegally” sourcing its work for training purposes.
