Meta case adds to mounting legal and public pressure around unlicensed data use in AI development
Earlier this week, The Atlantic launched a searchable database for Library Genesis (LibGen), a leading repository of pirated books and media. The tool is part of its ongoing coverage of Meta’s alleged use of unauthorized data to train AI systems. The LibGen dataset has been widely referenced in legal filings and academic research as a source of scraped books used to train large language models.
The Atlantic’s searchable tool enables authors and the public to identify whether specific titles appear in the dataset. It also places further scrutiny on the legality and ethics of using such material in commercial AI development. Notably, The Atlantic itself is one of several publishers currently suing AI startup Cohere for copyright infringement, alleging the company used copyrighted articles without permission to train its models.
As part of this wave of investigative reporting, The Guardian reached out to several prominent Australian authors whose works appear in the LibGen dataset and may have been used by Meta for its AI models, including its flagship openly available Llama family of LLMs. The Guardian highlights high-profile figures such as former prime ministers Malcolm Turnbull and John Howard, as well as authors Tracey Spicer and Holden Sheppard, all of whom called for stronger legislative protections and greater transparency from AI companies.
The scrutiny of Meta adds to a broader pattern unfolding globally in 2025, as courts, regulators, and publishers confront the foundational practices of the commercial AI stack.
Legal Momentum Is Accelerating in 2025
Several parallel cases help contextualize the situation and highlight how widespread the issue has become:
- In the U.S., Meta is currently defending itself in a copyright lawsuit filed by authors including Ta-Nehisi Coates and Sarah Silverman. In March 2025, Meta filed a motion to dismiss, arguing its use of copyrighted books to train AI models qualifies as fair use because the AI’s outputs are “non-infringing and transformative.” A ruling on that motion is pending.
- In a closely watched decision in February 2025, a U.S. federal judge ruled that Ross Intelligence infringed copyright when it used Westlaw’s headnotes — proprietary legal summaries owned by Thomson Reuters — to train an AI legal assistant. The court rejected Ross’s fair use defense, marking one of the first definitive rulings that training on copyrighted material can constitute infringement.
- Lawsuits from Dow Jones, News Corp, and the New York Post against Perplexity AI are also in progress, with plaintiffs alleging unlicensed use of proprietary news content. These cases further reflect a shift from informal complaints to formal legal action across multiple sectors.
As regulators, courts, and publishers push back, AI companies will need to shift from an extractive model to one that is permissioned by design. The commercial risk of ignoring this shift is growing. So is the opportunity to get it right.
Building Toward a Permissioned AI Economy
At Dappier, we’ve written previously about how these legal challenges signal the need for new infrastructure. In our coverage of The New York Times lawsuit against OpenAI and Microsoft, we noted that AI agents are becoming the dominant interface for content consumption — but the economic model hasn’t caught up. Publishers are being disintermediated, content is being used without licensing, and the existing web monetization stack is breaking down. This latest wave of litigation reinforces the need to rebuild that stack around permission, attribution, and real-time monetization.
Dappier is building toward that vision with infrastructure that makes content AI-ready, permissioned, and monetizable at the point of inference.
The next generation of AI should not rely on legal ambiguity. It should be built on transparent economics that reward content creators for the value they generate in the AI ecosystem.
Make your content AI-ready, permissioned, and monetizable. Learn more at dappier.com or schedule a demo.
