Five leading publishers - Elsevier, Cengage, Hachette, Macmillan and McGraw Hill - together with author Scott Turow filed a proposed class-action lawsuit in Manhattan federal court on Tuesday, accusing Meta Platforms of using their copyrighted books and journal articles without authorization to train its Llama artificial intelligence model.
The complaint alleges Meta copied and processed millions of works - ranging from textbooks and scientific papers to novels for the purpose of training large language models that respond to human prompts. The plaintiffs contend those works were used en masse without permission, a practice they describe as pirating material to develop AI capabilities.
According to the filing, examples of specific works allegedly included in the dataset used to train the model are "The Fifth Season" by N.K. Jemisin and "The Wild Robot" by Peter Brown. The publishers have asked the court to permit them to represent a broader class of copyright owners and are seeking monetary damages, though the complaint does not specify an exact amount.
Meta did not immediately respond to a request for comment on Tuesday.
Maria Pallante, president of the Association of American Publishers, is quoted in the complaint materials criticizing the conduct it describes: "Meta’s mass-scale infringement isn’t public progress, and AI will never be properly realized if tech companies prioritize pirate sites over scholarship and imagination."
Legal context and wider disputes
The lawsuit marks a fresh escalation in an ongoing legal dispute between creative owners and technology firms over the use of copyrighted material to train AI systems. Dozens of different plaintiffs - including authors, news organizations and visual artists - have brought suits against multiple AI companies, alleging similar patterns of unauthorized ingestion of copyrighted content.
Central to these cases is whether the use of copyrighted works to train AI qualifies as fair use, with courts asked to determine whether such training is sufficiently transformative to avoid infringement. Prior rulings in related litigation have diverged: two judges addressing the question issued conflicting decisions in the prior year, underscoring the unsettled legal landscape.
The complaint also notes that Anthropic, a company backed by Amazon and Google, resolved one of the earlier class actions by agreeing to a settlement payment of $1.5 billion to a group of authors.
What the complaint seeks and immediate implications
The publishers are pursuing permission to act on behalf of a larger class of copyright holders and unspecified monetary relief. The filing signals that established creators and rights holders are pushing aggressively to assert control over how their works are used in AI development.
How courts ultimately interpret fair use in the context of AI training datasets will determine whether this suit and similar cases result in monetary awards, licensing regimes, or altered practices in how AI developers assemble training data. For now, the litigation opens another front in a legal debate that remains unresolved.