Technology
Sora's Demise & The Brutal Economics of Video Diffusion
OpenAI shut down Sora after spending an estimated $15 million a day ($5.4B per year) on video generation to focus on enterprise products and coding tools ahead of an IPO.

The Cost Curve Has No Floor
OpenAI shut down Sora. After spending an estimated $15 million a day ($5.4B per year) on video generation, the company is refocusing on enterprise products and coding tools ahead of an IPO.
So what now?
Well: video generation, driven by diffusion models, has a steeper hill to climb than the LLMs that power GPT-5, Gemini, or Claude.
But pending WW3, the cost curve of video will likely still collapse to $0.01 per minute across all video categories. There are many paths up that mountain, even if the climb will be harder than it was for LLMs.
The Economics of Video Diffusion
Diffusion systems like Sora start with static noise and gradually clean it up, step by step, until a coherent image appears. Other parts of the system interpret user input and model the world, while still others keep the video consistent from its first frame to its last.
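The denoising loop at the heart of this can be sketched in a few lines. This is a toy illustration, not Sora's actual architecture: a real model learns the noise-prediction step with an enormous neural network, whereas `predict_noise` below is a hand-written stand-in so the loop structure is visible. All names and numbers are illustrative.

```python
import random

TARGET = [1.0, -1.0, 0.5]          # stand-in for a "clean" frame

def predict_noise(x):
    # A trained denoiser estimates the noise still present in x;
    # this toy version just points from x back toward TARGET.
    return [xi - ti for xi, ti in zip(x, TARGET)]

random.seed(0)
x = [random.gauss(0, 1) for _ in range(3)]    # step 0: pure static noise
for _ in range(50):                           # clean it up, step by step
    noise = predict_noise(x)
    x = [xi - 0.1 * ni for xi, ni in zip(x, noise)]

print([round(v, 3) for v in x])               # lands very close to TARGET
```

Each pass removes a fraction of the estimated noise, which is why the process is gradual: fifty small corrections rather than one big jump.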
Pulling this off requires enormous amounts of data and compute, both to train the model and to serve it to end users. That capital has to be allocated away from other models competing for the same compute.
In an era of cheap capital, cheap energy, and abundant compute, labs can afford more speculative bets, and the prospect of consuming the entire media pipeline promises huge riches for the labs and hyperscalers.
In a constrained environment, however, the ROI on LLMs per unit of compute outperforms video by an estimated 60:1. Not only is the budget for labor in organizations and enterprises larger than the budget for video and media, labor automation also has the virtuous property of aiding the development of the models themselves in a recursive positive loop.
Leadership at technology companies has already begun treating "token budget" as interchangeable with "labor budget". AI-replacement debates aside, the US service economy spends $12T a year on labor versus media's $2T.
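The figures above hang together on simple arithmetic, which is worth checking. This uses only numbers quoted in the article:

```python
# Back-of-the-envelope check of the figures quoted above.
daily_burn = 15e6                        # Sora's estimated daily spend, USD
annual_burn = daily_burn * 365           # ~= the reported $5.4B/year
print(f"Annualized burn: ${annual_burn / 1e9:.2f}B")

labor = 12e12                            # US service-economy labor spend/yr
media = 2e12                             # US media spend/yr
print(f"Labor-to-media spend ratio: {labor / media:.0f}:1")
```

Annualizing $15M/day gives roughly $5.5B, consistent with the rounded $5.4B figure, and the addressable-market gap between labor and media is 6:1 before any assumptions about capture rates.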
Video Models with Chinese Characteristics
The Chinese labs face a fundamentally different competitive calculus. ByteDance and Kuaishou own distribution platforms that make video diffusion businesses viable. Douyin and Kuaishou together host billions of short videos that double as training data, hundreds of millions of creators who immediately put generated content to commercial use, and advertising ecosystems that monetize views. When a creator uses Kling to generate an e-commerce ad and attaches a product link, the platform shares revenue based on views and clicks. The video model feeds the content ecosystem, which feeds the ad business, which pays for the compute.

The cost structure is also radically different. Chinese video models reportedly operate at one-sixth to one-tenth the inference cost of Sora for comparable output. Government computing centers directly subsidize research, with one facility in Changchun allocating 200 of 300 petaflops to a single video AI project. And the results show: Kling hit $240 million in annualized revenue by late 2025, and some Chinese video AI startups are approaching breakeven on subscription revenue alone, something no Western video AI company has come close to.

Most critically, the opportunity cost argument doesn't apply the same way. OpenAI had to choose between GPU hours for video and GPU hours for enterprise code generation; for the platform owners, video is the business.
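The reported cost gap can be made concrete with the same arithmetic. This uses only figures from the article, and note that Sora's $15M/day bundles training and serving, so treat it as a loose bound rather than a like-for-like comparison:

```python
# Rough illustration of the reported cost gap.
sora_daily = 15e6                        # Sora's estimated daily spend, USD
cheap_low, cheap_high = sora_daily / 10, sora_daily / 6
print(f"Implied daily cost at reported efficiency: "
      f"${cheap_low / 1e6:.1f}M to ${cheap_high / 1e6:.1f}M")

kling_annual = 240e6                     # Kling's reported annualized revenue
print(f"Kling revenue per day: ${kling_annual / 365 / 1e6:.2f}M")
```

At one-sixth to one-tenth of Sora's burn, the implied daily cost falls to $1.5M-$2.5M, which is the scale at which subscription and ad revenue sharing start to plausibly cover compute.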
What Happens Now?
The situation is fluid. Falling compute costs or technical breakthroughs may yet make diffusion economically viable, and nimble startups may triumph after all. Runway raised a gargantuan round and continues to release models, but still falls short of the front-runners.
My predictions:
- Google's and xAI's next-generation models may push the frontier, and with their balance sheets and access to their own compute and distribution platforms, they may be able to build viable vertically integrated platforms similar to their Chinese peers'. Headwinds could come from the cost of energy, restricted compute, or xAI's restructuring.
- Chinese models will likely continue pulling further ahead, with gated access to Western markets pending resolution of questions around IP and copyright.
- Existing open-source models will continue to develop and get folded into existing pipelines. As they pull ahead, the economics change entirely if they can run on local machines.