Runway, a Google-backed startup, faces accusations that it used YouTube videos without permission to train its AI video generation tool, Gen-3 Alpha.
The company, valued at $1.5 billion, reportedly used proxies and web crawlers to download videos, raising serious concerns about intellectual property rights.
The controversy highlights the ongoing issue of copyright infringement in AI development, with YouTube’s CEO warning that such practices violate the platform’s terms of service.
Runway accused of using pirated content to train its AI
What is the story
Runway, a Google-backed AI startup, is facing accusations that it used pirated content and YouTube videos without authorization to train its Gen-3 Alpha video generation tool. The allegations emerged from leaked internal documents obtained by 404 Media and reportedly shared by a former Runway employee. The documents outline a plan to categorize and tag content from more than 3,900 YouTube channels, including major media companies like Disney and Netflix, and popular creators like Casey Neistat and Marques Brownlee (MKBHD).
Gen-3 Alpha gains attention amid controversy
Runway’s Gen-3 Alpha video generation tool gained a lot of attention last month for its ability to generate nearly photorealistic clips. The company said the tool was “jointly trained on videos and images,” but did not reveal the source of that data. Runway has not confirmed the authenticity of the leaked spreadsheet, and it previously said it used a “curated in-house dataset” for training. However, 404 Media reported that it was able to use the tool to create convincing videos of well-known YouTube personalities.
Alleged use of proxies and large web crawlers
Runway reportedly went so far as to cover its tracks by using proxies to avoid being blocked by YouTube. “That spreadsheet channel was a company-wide effort to find good quality videos to build a model on,” an anonymous former employee told 404 Media. “This was then used as input to a massive web crawler to download every video from every channel while using proxies to avoid being blocked by Google,” the employee added.
Intellectual property concerns in AI training
This isn’t the first time an AI company has come under scrutiny for using copyrighted material without the necessary licenses. Earlier this year, OpenAI CTO Mira Murati admitted in an interview with The Wall Street Journal that she was unsure whether the training data for the company’s upcoming Sora video generator included videos from Instagram, YouTube, and Facebook. The New York Times then reported that OpenAI had sidestepped company policies and copyright law by using a tool that transcribes YouTube videos to train an AI chatbot.
YouTube CEO warns of platform violations
YouTube CEO Neal Mohan has warned AI companies that using YouTube videos to train AI models is a “clear violation” of the platform’s terms of service. Intellectual property infringement issues remain a major obstacle in the development of generative AI, especially for models that can generate entire videos. Runway, which is valued at $1.5 billion, raised $141 million last year from investors including YouTube owner Google, NVIDIA and Salesforce.