
Silicon Valley Controversy: Meta Allegedly Using Pirated Porn to Train AI

Artificial intelligence feeds on data the way humans feed on knowledge. What makes these systems more powerful every day is the images, text and video we give them to analyze, often in bulk and out of sight. But what happens when those sources turn out to be contentious, or even illegal? That is precisely the question raised by suspicious downloads attributed to Meta, which several rights holders accuse of drawing on a catalog as sensitive as it is controversial to train its video models.

Suspicious downloads at Meta raise the alarm

It all started with a cross-check. After suspicious activity came to light in a separate complaint involving Meta and several book authors, the companies Strike 3 Holdings and Counterlife Media searched their own databases. To their surprise, they say they found 47 IP addresses belonging to Meta associated with the downloading of 157 of their films between 2018 and 2025. The number seems modest, but the rights holders believe the content was used to train the company's video models. They are seeking $359 million for the alleged copyright infringement, according to TorrentFreak.

At the heart of the accusation is the claim that Meta intentionally used these files to feed Movie Gen, its AI video generator. To support this theory, the plaintiffs argue that a parallel network of more than 2,500 “masked” IP addresses was used to hide large-scale downloads. They also point to files detected on the IP address of a former Meta contractor, registered to his father's home. For Strike 3, the correlations in content types and periods of activity point to an organized scheme.

Meta dismantles an accusation it deems absurd

Faced with this attack, Meta is asking for the complaint to be dismissed outright. The company rejects any direct involvement in these downloads, saying there is no evidence linking its teams or models to the use of adult content. It highlights a key point: the alleged downloads span seven years, whereas its video AI research only really began in 2022. It therefore seems inconsistent to claim that the downloads were intended to train algorithms, especially when most of the files concerned were downloaded well before Movie Gen or LLaMA 4 existed.

In a filing with the U.S. District Court for the Northern District of California, Meta builds its argument on established case law. The group relies in particular on the Cobbler Nevada precedent, which holds that an IP address alone is not sufficient to prove that a given person or entity is responsible for the activity traced to it. The company points out that its networks are used daily by thousands of employees, contractors, visitors and unidentified third parties. It also argues that the volume of files involved (around 22 per year) in no way matches the gigantic datasets usually used to train AI. Ars Technica reports that Meta calls the entire theory behind the accusation “baseless” and “incoherent.”

The company also highlights what it presents as a decisive detail: its own terms of service explicitly prohibit generating or using explicit content via its AI systems. Therefore, Meta argues, even if videos of this type had been downloaded, its systems would not be able to integrate them in any way. As for the files traced to the contractor, Meta asserts that there is no evidence he obtained them in the course of his duties. The fact that the downloads stopped after his contract ended is not enough, according to Meta, to implicate the company.

In the background, growing concern over data traceability

Far from being a simple dispute between an adult film studio and a digital giant, the case raises a fundamental question: how far can we trace the content that feeds artificial intelligence? If the piracy allegations are proven, they could undermine the legal standing of many models. If, on the contrary, the courts confirm that an IP address alone is not enough to establish liability, rights holders could lose strategic leverage in future battles.

The case highlights the persistent vagueness surrounding the regulation of AI. Who actually controls where the data comes from? How can anyone demonstrate, in an ocean of downloaded files, that a given piece of content was used to train a model? And above all, who bears responsibility when a company like Meta allows tens of thousands of people to freely access its network?

As copyright infringement lawsuits against AI giants multiply, the outcome of this complaint could set a precedent. It may draw the line between passive negligence and intentional training, and ultimately determine the legal boundary of what machines have the right to learn.

With an unwavering passion for local news, Christopher leads our editorial team with integrity and dedication. With over 20 years’ experience, he is the backbone of Wouldsayso, ensuring that we stay true to our mission to inform.
