Illustration by Megaton

Google announces Gemini Omni models for generating video from any input

By Julius Robert
Tuesday, May 19th 2026

The new AI family promises cross-modal generation capabilities

Gemini Omni is a New Family of AI Models That Promise Cross-Modal Generation, but Details Are Sketchy

According to Engadget, Google recently announced Gemini Omni, which appears to be a new family of AI models intended to create content in a variety of formats. Video generation is the main focus of the release. However, the company provided few technical specifics about what the models can do beyond creating video, and it gave no information on when they will become available.

Ambitious Claims of Cross-Modal Generation Capabilities

With Gemini Omni, Google is moving toward more flexible content creation and away from single-format models (e.g., text-to-text or image-to-image). The new models are instead claimed to translate between multiple formats. According to Engadget’s reporting, Gemini Omni models will be able to produce video output. It is not yet known, however, whether other formats (text, images, etc.) will also be accepted as input, or whether the models will be able to produce output in every format.

Google seems to be using the "Anything From Any Input" tagline to promote a universal translation tool that accepts a wide array of inputs (video, text, images, etc.) and produces a similarly wide range of output formats. Without more detail about the actual implementation, however, it is unclear how these cross-modal capabilities compare with competitors’ tools or with Google’s own earlier Gemini releases.
