

Artificial intelligence has not only changed how content is generated, examined, and distributed, but has also sparked legal and ethical debates. Recently, prominent writers filed lawsuits against major AI companies, including OpenAI, Meta, and Google. The allegations are serious: the unauthorized use of copyrighted texts to train AI models has become a global concern.
The lawsuits, filed in the US District Court for the Northern District of California, highlight the conflict between protecting intellectual property and fostering innovation. The complaints allege that AI developers pirated books from online “shadow libraries,” raising questions about fair use, authors’ control, and the need for ethical, licensed data sourcing.
Plaintiffs argue that the defendants illegally accessed copyrighted books from so-called shadow libraries, such as LibGen, Z-Library, and OceanofPDF, and then used those works to develop large language models.
The lawsuits allege that the corporations copied, studied, and incorporated protected content into AI systems without compensating the creators or obtaining their permission. The allegations of intentional copyright infringement sparked an immediate global debate.
The suits are led by authors including John Carreyrou, a New York Times reporter and the author of “Bad Blood,” along with Philip Shishkin, Lisa Barretta, Jane Adams, Matthew Sacks, and Michael Kochin. The filings allege that specific works were downloaded illegally and form part of the training datasets behind commercial AI chatbots and generative models.
Rather than seeking class-action status, the plaintiffs are pursuing individual claims to recover larger potential damages for each infringed work, including statutory awards, restitution, attorneys’ fees, and permanent injunctions against further use. The lawsuits demand jury trials on claims of direct and willful copyright infringement.
Representatives for OpenAI, Meta, Google, Anthropic, xAI, and Perplexity did not immediately comment on the complaints when approached by media outlets. These cases add to a broader wave of litigation challenging how AI companies source training data, with earlier suits raising similar questions about fair use and the legality of the underlying datasets.
The issue of pirated material in AI training is not new; a 2025 settlement saw Anthropic agree to pay authors about $1.5 billion over claims it downloaded millions of pirated books for its chatbot training. Other suits have targeted companies like Adobe for allegedly using pirated texts in smaller language models.
In previous legal battles, defendants have sometimes argued that training on copyrighted works qualifies as fair use under US copyright law, a defence that courts have examined in separate cases. Meta’s past legal filings have claimed fair use and defended the disputed datasets on technical grounds, focusing on how the files were torrented rather than on whether they were distributed.
Authors and copyright advocates argue that unlicensed use of literary works for AI training undermines their ability to control and profit from their creations. A key concern centers on whether AI companies will pay fair compensation for training data or simply rely on publicly accessible but unauthorised sources of copyrighted content.
The rise in legal actions against AI companies reflects growing tension between technological advancement and intellectual property protection. As authors challenge the use of pirated books in AI training, the outcomes of these cases could help define the boundaries of fair use in the digital age.
Moreover, such lawsuits remind AI developers of the need to source data ethically and under proper licenses. The court rulings may soon create a framework that fairly compensates creators, strengthening a creative ecosystem grounded in both the law and content owners’ rights.