Regulation

YouTubers Launch Class Action Against ByteDance Over AI Training Data

January 19, 2026 | By Megaton AI

Major creators including Ethan Klein are suing ByteDance and Meta for allegedly circumventing YouTube's security to scrape videos for AI model training.

Ethan Klein's latest YouTube video doesn't feature his usual commentary format. The h3h3Productions creator sits in front of legal documents, explaining how ByteDance allegedly scraped his content to train its Magic Video AI model. "They used sophisticated tools to bypass YouTube's streaming protocols," Klein states in the December 27 video, holding up printed pages from the HD-VILA-100M dataset that he claims contains his work.

The class-action lawsuits, filed December 23 in the Northern District of California, mark a strategic shift in how creators are challenging AI companies. Rather than pursuing traditional copyright infringement claims, the plaintiffs are invoking Section 1201 of the Digital Millennium Copyright Act—the provision that makes it illegal to circumvent technological protection measures. This approach sidesteps thorny questions about fair use and focuses on the alleged methods used to obtain the training data.

According to complaints reviewed by Bloomberg Law, ByteDance and Meta allegedly used automated tools such as yt-dlp to download streaming-only content from YouTube. These tools work by requesting the same segmented video chunks that YouTube serves to browsers during playback, then reassembling them into downloadable files, which converts stream-only content into permanent copies.

The distinction matters. YouTube's terms of service prohibit automated downloading, and the platform implements various technical measures to enforce streaming-only access. By allegedly circumventing these measures, the companies may have violated the DMCA even if the underlying content use might otherwise qualify as fair use.

A legal analysis on Pascal's Substack notes that this Section 1201 strategy echoes tactics used in early DVD decryption cases. "The plaintiffs aren't arguing about whether AI training constitutes fair use," the analysis explains. "They're arguing that the act of bypassing YouTube's access controls is itself illegal, regardless of what happens to the content afterward."

Ted Entertainment, another plaintiff in the ByteDance suit, alleges its YouTube videos appeared in datasets used to train proprietary video generation models. The company claims ByteDance's scraping operation was industrial in scale, targeting millions of hours of content.

The timing appears deliberate. These filings bring the total number of AI-related copyright suits to over 70, according to tracking by ChatGPT Is Eating the World. The surge suggests creators are coordinating their legal strategies as video generation models become commercially viable.

ByteDance's Magic Video model, announced earlier this year, can generate 16-second clips from text prompts. Meta's video generation capabilities remain less public, though the company has demonstrated various AI video tools. Neither company responded to requests for comment about the lawsuits by press time.

The HD-VILA-100M dataset sits at the center of the ByteDance allegations. Klein and other plaintiffs claim this dataset, containing 100 million video clips, was built using scraped YouTube content. Similar allegations have been made against Nvidia in separate litigation.

These cases focus on method rather than outcome. Previous AI training lawsuits have struggled with courts' varying interpretations of transformative use. By targeting the scraping process itself, these plaintiffs may have found a more straightforward legal path.

The DMCA's anti-circumvention provisions carry statutory damages of $200 to $2,500 per act of circumvention. With millions of allegedly scraped videos, the potential damages could reach billions, though courts rarely award maximum statutory amounts.
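To make the scale concrete, here is a rough sketch of the arithmetic. It assumes each scraped video counts as one act of circumvention and uses the statutory range of $200 to $2,500 per act from 17 U.S.C. 1203(c)(3)(A). The video counts and the function name are hypothetical, chosen for illustration; they are not figures from the complaints.

```python
# Statutory range per act of circumvention under 17 U.S.C. 1203(c)(3)(A).
MIN_PER_VIOLATION = 200
MAX_PER_VIOLATION = 2_500


def statutory_damages_range(violations: int) -> tuple[int, int]:
    """Return (low, high) statutory exposure in dollars for a violation count."""
    if violations < 0:
        raise ValueError("violations must be non-negative")
    return violations * MIN_PER_VIOLATION, violations * MAX_PER_VIOLATION


# Hypothetical counts for illustration only, not taken from the filings.
for count in (1_000_000, 5_000_000):
    low, high = statutory_damages_range(count)
    print(f"{count:,} violations: ${low:,} to ${high:,}")
```

Under these assumptions, even one million violations spans $200 million to $2.5 billion, which is why the per-violation count a court accepts matters as much as the per-violation amount.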

Video creators may want to audit whether their content appears in known AI training datasets. The DMCA Section 1201 strategy could become a template for future creator lawsuits, and companies building video AI models may face pressure to demonstrate clean data provenance. Court rulings on whether YouTube's anti-scraping measures qualify as protected access controls could also set significant precedent, while settlement negotiations might establish industry norms for creator compensation.
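For creators who want to try such an audit, a minimal sketch follows. It assumes the dataset publishes a manifest of source URLs, one per entry, which is an assumption: real datasets use varying metadata formats, and the function names and sample IDs below are hypothetical.

```python
from typing import Iterable, Optional, Set
from urllib.parse import parse_qs, urlparse

# Hostnames treated as YouTube watch-page hosts in this sketch.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def youtube_video_id(url: str) -> Optional[str]:
    """Extract the video ID from a watch or youtu.be URL, or None if absent."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in YOUTUBE_HOSTS:
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    return None


def find_matches(my_video_ids: Iterable[str], manifest_urls: Iterable[str]) -> Set[str]:
    """Return the subset of my_video_ids that appear among the manifest URLs."""
    dataset_ids = {vid for vid in map(youtube_video_id, manifest_urls) if vid}
    return set(my_video_ids) & dataset_ids


# Hypothetical manifest entries and channel video IDs, for illustration only.
manifest = [
    "https://www.youtube.com/watch?v=AAAAAAAAAAA",
    "https://youtu.be/BBBBBBBBBBB",
    "https://example.com/not-a-video",
]
mine = ["BBBBBBBBBBB", "CCCCCCCCCCC"]
print(sorted(find_matches(mine, manifest)))
```

A match only shows that a URL is listed in a manifest. It does not by itself establish how the video was obtained or whether it was used in training.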

The Northern District of California will likely consolidate these cases given their similar claims and defendants. Discovery could reveal the actual scale and methods of any scraping operations, details that remain largely speculative in the public filings. The central question is whether AI companies broke the law in how they obtained YouTube content, not in what they did with it afterward.
