The short version
On December 22, 2025, New York Times investigative reporter and Bad Blood author John Carreyrou, joined by five other writers, filed a copyright lawsuit in the U.S. District Court for the Northern District of California against Anthropic, Google, OpenAI, Meta, Elon Musk’s xAI, and Perplexity. The six named plaintiffs say the companies trained and/or optimized their large language models using “pirated” copies of books from shadow libraries like LibGen and Z-Library, without permission or payment. The case, Carreyrou v. Anthropic PBC et al., No. 3:25‑cv‑10897, seeks statutory damages and an injunction and, notably, is not a class action. Reuters | Complaint (PDF)

What’s new—and why it matters
- First suit to name xAI: Coverage of the filing emphasizes this is the first copyright case to list xAI as a defendant, reflecting how quickly newcomers are being pulled into the same legal thicket as incumbents. Reuters
- Not a class action: Carreyrou and five other writers—Lisa Barretta, Philip Shishkin, Jane Adams, Matthew Sacks, and Michael Kochin—are pursuing individual claims, arguing class deals undervalue authors’ rights (they point to an estimated ~$3,000 per work in the Anthropic class settlement as “about 2%” of the statutory maximum). Complaint | Reuters
- The allegation: defendants downloaded and copied “gold‑standard” book content from shadow libraries (LibGen, Z‑Library, OceanofPDF, and Books3) to train or tune chatbots and related products. Complaint

The legal landscape the case drops into
Two Northern District of California decisions from mid‑2025 set important (if contested) waypoints:
- Kadrey v. Meta (June 25, 2025): Judge Vince Chhabria found Meta’s book‑based training of Llama to be “transformative” fair use on the record before him, while stressing the ruling didn’t bless AI training in general and leaving other claims (like alleged torrenting) unresolved. CNBC | The Verge
- Bartz v. Anthropic (June 2025 ruling; settlement activity through December 2025): In June 2025 the court indicated that while some training can be fair use, downloading from pirate sources could still matter for liability or damages. The parties later reached a $1.5B class settlement, which has since been scrutinized over fees. Reuters
At the same time, publishers’ cases continue. The New York Times’ lawsuit against OpenAI and Microsoft survived largely intact in March 2025, keeping fair‑use questions headed toward trial in New York. Associated Press via Inquirer | OpenAI’s case page

What the companies are likely to argue
Most AI developers maintain that training on publicly available materials is fair use, often highlighting opt‑outs or licenses:
- OpenAI says training on publicly available data is fair use and points to publisher opt‑outs it honors. OpenAI
- Google highlights its Google‑Extended robots.txt control to let sites opt out of AI training for Gemini/Vertex. Google Blog
- Meta has asserted a fair‑use defense in author suits and won a key ruling in June 2025. CNBC
- Perplexity told Reuters it “doesn’t index books,” while facing separate publisher suits over news content and RAG‑style outputs. Reuters | Loeb & Loeb case note
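
The Google‑Extended control mentioned above is expressed as an ordinary robots.txt rule. The sketch below is a minimal illustration, not Google's implementation: it uses Python's standard `urllib.robotparser` to check how a sample file treats the `Google-Extended` token versus a regular crawler. The sample robots.txt contents, the `example.com` URL, and the helper name `allows` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, allow everything else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allows(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative URL only; any path on the site is covered by "Disallow: /".
PAGE = "https://example.com/chapter-1"

training_allowed = allows(ROBOTS_TXT, "Google-Extended", PAGE)
search_allowed = allows(ROBOTS_TXT, "Googlebot", PAGE)

print("Google-Extended allowed:", training_allowed)
print("Googlebot allowed:", search_allowed)
```

A check like this only confirms what the file says. Whether a given crawler honors the rule is up to its operator, and robots.txt does nothing about copies obtained from shadow libraries, which is what this complaint targets.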

Case details at a glance
Carreyrou v. Anthropic PBC et al.: snapshot

| Field | Detail |
|---|---|
| Court | U.S. District Court, Northern District of California |
| Case | Carreyrou v. Anthropic PBC et al. |
| Number | 3:25‑cv‑10897 |
| Filed | December 22, 2025 |
| Plaintiffs | John Carreyrou; Lisa Barretta; Philip Shishkin; Jane Adams; Matthew Sacks; Michael Kochin |
| Defendants | Anthropic; Google; OpenAI (and affiliates); Meta; xAI; Perplexity |
| Core claim | Direct copyright infringement (17 U.S.C. §501) for copying/using books from shadow libraries to train or optimize LLMs |
| Relief sought | Statutory damages (up to $150,000 per work, per defendant, for willful infringement), injunction, fees |
| Notable | Plaintiffs declined class treatment, citing low per‑work payouts in other settlements |
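
The plaintiffs' "about 2%" argument, and the scale of the relief sought, can be checked with simple arithmetic using only the figures reported above. The sketch below is a rough illustration: the function names are hypothetical, and the per-work ceiling assumes the complaint's framing (the willful maximum applied per work, per defendant), which a court may not accept.

```python
# Figures taken from the article; everything else is illustrative.
STATUTORY_MAX_WILLFUL = 150_000    # USD per work, willful infringement
SETTLEMENT_PER_WORK = 3_000        # USD, estimated Anthropic class payout per work
NUM_DEFENDANTS = 6                 # Anthropic, Google, OpenAI, Meta, xAI, Perplexity


def settlement_share(settlement: int, statutory_max: int) -> float:
    """Fraction of the statutory maximum represented by a settlement payout."""
    return settlement / statutory_max


def theoretical_ceiling(works: int, defendants: int,
                        per_work: int = STATUTORY_MAX_WILLFUL) -> int:
    """Upper bound if the maximum were awarded per work against every defendant."""
    return works * defendants * per_work


share = settlement_share(SETTLEMENT_PER_WORK, STATUTORY_MAX_WILLFUL)
ceiling_one_work = theoretical_ceiling(works=1, defendants=NUM_DEFENDANTS)

print(f"Settlement as share of statutory maximum: {share:.0%}")
print(f"Theoretical ceiling for one work: ${ceiling_one_work:,}")
```

The ceiling is an upper bound implied by the numbers in the table, not a prediction. Statutory awards are discretionary within a range, and the maximum applies only if willfulness is proven.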

What this could mean for AI builders and buyers
Beyond the headline, the complaint challenges the “data supply chain” that underpins generative AI. If courts accept that sourcing from shadow libraries reflects willful infringement—even where certain training might be deemed “transformative”—exposure could multiply across models, checkpoints, and products. That risk profile encourages:
- Verified licensing and provenance for book‑length text, not just web crawl data.
- Documentation of training inputs, fine‑tunes, and retrieval indexes separate from “core” pretraining.
- Product‑level mitigations against regurgitation and long verbatim quotation, since outputs that reproduce long passages can weigh against fair use.
- Broader adoption of machine‑readable control/consent signals (e.g., Google‑Extended) and emerging licensing standards.
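
One way to approach the regurgitation mitigation listed above is to measure the longest run of consecutive words a model output shares with a protected text. The sketch below is a minimal illustration under stated assumptions: the function name, the whitespace tokenization, and the `MAX_SHARED_WORDS` threshold are all hypothetical, and the threshold is not a legal standard.

```python
def longest_shared_run(output: str, reference: str) -> int:
    """Length, in words, of the longest contiguous word sequence found in both texts."""
    a = output.lower().split()
    b = reference.lower().split()
    best = 0
    prev = [0] * (len(b) + 1)  # run lengths ending at the previous word of `a`
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1  # extend the run ending at the prior pair
                best = max(best, cur[j])
        prev = cur
    return best


# Hypothetical policy threshold; a real system would tune this with counsel.
MAX_SHARED_WORDS = 4


def should_flag(output: str, reference: str, limit: int = MAX_SHARED_WORDS) -> bool:
    """Flag outputs whose longest verbatim overlap exceeds the limit."""
    return longest_shared_run(output, reference) > limit


reference_text = "the quick brown fox jumps over the lazy dog"
model_output = "he said the quick brown fox jumps away"

print(longest_shared_run(model_output, reference_text))
print(should_flag(model_output, reference_text))
```

This quadratic comparison is fine for short passages. Checking outputs against a large corpus would need an index, such as hashed n-grams, which is beyond this sketch.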

The open questions a jury could decide
- Does the source of training data matter? Early rulings suggest some training may be fair use, but sourcing from pirate sites could still affect liability and any enhanced damages for willfulness.
- How much regurgitation is too much? Academic work continues to probe when models memorize and emit protectable text, a factor that can weigh against fair use. (See, e.g., emerging memorization studies from 2025.)
- What counts as “optimization” vs. “training”? The complaint also targets ingestion steps like preprocessing and retrieval‑augmented generation caches.

What to watch next
- Motions to dismiss/transfer: Expect threshold challenges and early fights over discovery scope and dataset disclosures.
- Interplay with other cases: The Northern District of California's Meta and Anthropic rulings will loom large, though one district judge's decision does not bind another; the New York Times case will shape the narrative on news content.
- The market response: More publishers are deploying robots.txt controls (e.g., Google‑Extended) and exploring licensing frameworks, while model providers weigh re‑training costs versus settlement risk. Google Blog

Sources
- Reuters: “New York Times reporter sues Google, xAI, OpenAI over chatbot training” (Dec 22, 2025).
- Complaint, Carreyrou v. Anthropic PBC et al., 3:25‑cv‑10897 (N.D. Cal. Dec 22, 2025).
- Reuters: “Anthropic asks judge to slash legal fees in $1.5 billion settlement” (Dec 18, 2025).
- CNBC: “Judge rules Meta’s use of books to train AI is fair use” (June 25, 2025).
- The Verge: Analysis of Meta ruling and fair use caveats (June 2025).
- OpenAI: “OpenAI and journalism” (policy/stance on training and opt‑outs).
- Google Blog: “An update on web publisher controls” (Google‑Extended).
- Loeb & Loeb: Dow Jones & Co. v. Perplexity AI (case summary).