The short version

On December 22, 2025, New York Times investigative reporter and Bad Blood author John Carreyrou, joined by five other authors, filed a copyright lawsuit in the U.S. District Court for the Northern District of California against Anthropic, Google, OpenAI, Meta, Elon Musk’s xAI, and Perplexity. The six plaintiffs say the companies trained and/or optimized their large language models using “pirated” copies of books from shadow libraries like LibGen and Z-Library, without permission or payment. The case, Carreyrou v. Anthropic PBC et al., No. 3:25‑cv‑10897, seeks statutory damages and an injunction and, notably, is not a class action. Reuters | Complaint (PDF)

$1.5B: Largest AI‑training copyright settlement (Source: Reuters, Dec 18, 2025)

What’s new—and why it matters

  • First suit to name xAI: Coverage of the filing emphasizes this is the first copyright case to list xAI as a defendant, reflecting how quickly newcomers are being pulled into the same legal thicket as incumbents. Reuters
  • Not a class action: Carreyrou and five other writers—Lisa Barretta, Philip Shishkin, Jane Adams, Matthew Sacks, and Michael Kochin—are pursuing individual claims, arguing class deals undervalue authors’ rights (they point to an estimated ~$3,000 per work in the Anthropic class settlement as “about 2%” of the statutory maximum). Complaint | Reuters
  • The allegation: defendants downloaded and copied “gold‑standard” book content from shadow libraries (LibGen, Z‑Library, OceanofPDF, and Books3) to train or tune chatbots and related products. Complaint

The legal landscape the case drops into

Two Northern District of California cases from 2025 set important (if contested) waypoints:

  • Kadrey v. Meta (June 25, 2025): Judge Vince Chhabria found Meta’s book‑based training of Llama to be “transformative” fair use on the record before him, while stressing the ruling didn’t bless AI training in general and leaving other claims (like alleged torrenting) unresolved. CNBC | The Verge
  • Bartz v. Anthropic (Oct–Dec 2025): The parties reached a $1.5B class settlement, which later drew scrutiny over legal fees. The court had indicated that, while some training can be fair use, downloading from pirate sources could still matter for liability or damages. Reuters

At the same time, publishers’ cases continue. The New York Times’ lawsuit against OpenAI and Microsoft survived a motion to dismiss largely intact in March 2025, keeping fair‑use questions headed toward trial in New York. Associated Press via Inquirer | OpenAI’s case page

What the companies are likely to argue

Most AI developers maintain that training on publicly available materials is fair use, often highlighting opt‑outs or licenses:

  • OpenAI says training on publicly available data is fair use and points to publisher opt‑outs it honors. OpenAI
  • Google highlights its Google‑Extended robots.txt control to let sites opt out of AI training for Gemini/Vertex. Google Blog
  • Meta has asserted a fair‑use defense in author suits and won a key ruling in June 2025. CNBC
  • Perplexity told Reuters it “doesn’t index books,” while facing separate publisher suits over news content and RAG‑style outputs. Reuters | Loeb & Loeb case note
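For readers who want to see what a control like Google‑Extended looks like in practice, here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt content, the example.com URL, and the is_allowed helper are all hypothetical. The check only reports what the file says; whether a crawler actually honors it is up to the crawler's operator.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example publisher site: block the
# Google-Extended token everywhere, leave all other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/books/chapter-1"  # illustrative URL only
    for agent in ("Google-Extended", "Googlebot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, url))
```

Under this example file, the Google‑Extended token is refused while ordinary search crawling is still permitted, which is the split the opt‑out is meant to offer.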

Case details at a glance

Carreyrou v. Anthropic PBC et al. — snapshot

Field | Detail
Court | U.S. District Court, Northern District of California
Case | Carreyrou v. Anthropic PBC et al.
Number | 3:25‑cv‑10897
Filed | December 22, 2025
Plaintiffs | John Carreyrou; Lisa Barretta; Philip Shishkin; Jane Adams; Matthew Sacks; Michael Kochin
Defendants | Anthropic; Google; OpenAI (and affiliates); Meta; xAI; Perplexity
Core claim | Direct copyright infringement (17 U.S.C. §501) for copying/using books from shadow libraries to train or optimize LLMs
Relief sought | Statutory damages (up to $150,000 per work, per defendant, for willful infringement), injunction, fees
Notable | Plaintiffs declined class treatment, citing low per‑work payouts in other settlements
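As a rough illustration of the arithmetic behind these figures, the sketch below reproduces the complaint's "about 2%" comparison and the theoretical per‑work ceiling if the $150,000 maximum were applied once per defendant. The constant and function names are illustrative, and the six‑defendant count treats OpenAI and its affiliates as one. The result is an upper bound on statutory damages under the complaint's theory, not a prediction of any award.

```python
# Figures taken from the article; names are illustrative, not from the complaint.
STATUTORY_MAX_WILLFUL = 150_000  # USD per work for willful infringement
SETTLEMENT_PER_WORK = 3_000      # USD, approximate per-work figure in the Anthropic class deal
NUM_DEFENDANTS = 6               # Anthropic, Google, OpenAI, Meta, xAI, Perplexity


def settlement_share(settlement: int, statutory_max: int) -> float:
    """Fraction of the statutory maximum represented by a per-work settlement."""
    return settlement / statutory_max


def ceiling_per_work(statutory_max: int, defendants: int) -> int:
    """Theoretical per-work ceiling if the maximum applied once per defendant."""
    return statutory_max * defendants


if __name__ == "__main__":
    share = settlement_share(SETTLEMENT_PER_WORK, STATUTORY_MAX_WILLFUL)
    print(f"Settlement share of statutory max: {share:.1%}")
    print(f"Per-work ceiling across defendants: ${ceiling_per_work(STATUTORY_MAX_WILLFUL, NUM_DEFENDANTS):,}")
```

The gap between roughly $3,000 and a six‑figure ceiling per work is the core of the plaintiffs' stated reason for avoiding class treatment.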

What this could mean for AI builders and buyers

Beyond the headline, the complaint challenges the “data supply chain” that underpins generative AI. If courts accept that sourcing from shadow libraries reflects willful infringement—even where certain training might be deemed “transformative”—exposure could multiply across models, checkpoints, and products. That risk profile encourages:

  • Verified licensing and provenance for book‑length text, not just web crawl data.
  • Documentation of training inputs, fine‑tunes, and retrieval indexes separate from “core” pretraining.
  • Product‑level mitigations against regurgitation and quotation of long passages (which weigh against fair use).
  • Broader adoption of machine‑readable control/consent signals (e.g., Google‑Extended) and emerging licensing standards.
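To make the regurgitation point concrete, here is a minimal sketch of one possible product‑level check: measuring the longest run of words a model output shares verbatim with a protected text. It uses Python's standard difflib module. The function names and the 50‑word budget are hypothetical; no court has set a numeric threshold, and a real system would need far more than exact word matching.

```python
from difflib import SequenceMatcher


def longest_shared_run(output: str, reference: str) -> int:
    """Length, in words, of the longest verbatim word sequence shared by both texts."""
    out_words = output.lower().split()
    ref_words = reference.lower().split()
    # autojunk=False so frequent words are not discarded as "junk" on long inputs.
    matcher = SequenceMatcher(None, out_words, ref_words, autojunk=False)
    match = matcher.find_longest_match(0, len(out_words), 0, len(ref_words))
    return match.size


def exceeds_quote_budget(output: str, reference: str, max_words: int = 50) -> bool:
    """True if the shared run is longer than max_words (a hypothetical policy limit)."""
    return longest_shared_run(output, reference) > max_words


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog"
    output = "He wrote that the quick brown fox jumps high"
    print(longest_shared_run(output, reference))
    print(exceeds_quote_budget(output, reference, max_words=4))
```

A check like this only catches exact word‑for‑word overlap after lowercasing; paraphrase, punctuation differences, and close summaries would need other techniques.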

The open questions a jury could decide

  • Does the source of training data matter? Early rulings suggest some training may be fair use, but sourcing from pirate sites could still affect liability and any enhanced damages for willfulness.
  • How much regurgitation is too much? Academic work continues to probe when models memorize and emit protectable text, a factor that can weigh against fair use. (See, e.g., emerging memorization studies from 2025.)
  • What counts as “optimization” vs. “training”? The complaint also targets ingestion steps like preprocessing and retrieval‑augmented generation caches.

What to watch next

  • Motions to dismiss/transfer: Expect threshold challenges and early fights over discovery scope and dataset disclosures.
  • Interplay with other cases: The Northern District of California’s Meta and Anthropic rulings will loom large; the New York Times case will shape the narrative on news content.
  • The market response: More publishers are deploying robots.txt controls (e.g., Google‑Extended) and exploring licensing frameworks, while model providers weigh re‑training costs versus settlement risk. Google Blog

Sources

  • Reuters: “New York Times reporter sues Google, xAI, OpenAI over chatbot training” (Dec 22, 2025).
  • Complaint, Carreyrou v. Anthropic PBC et al., 3:25‑cv‑10897 (N.D. Cal. Dec 22, 2025).
  • Reuters: “Anthropic asks judge to slash legal fees in $1.5 billion settlement” (Dec 18, 2025).
  • CNBC: “Judge rules Meta’s use of books to train AI is fair use” (June 25, 2025).
  • The Verge: Analysis of Meta ruling and fair use caveats (June 2025).
  • OpenAI: “OpenAI and journalism” (policy/stance on training and opt‑outs).
  • Google Blog: “An update on web publisher controls” (Google‑Extended).
  • Loeb & Loeb: Dow Jones & Co. v. Perplexity AI (case summary).