What happened

The New York Times filed a federal lawsuit against Perplexity on Friday, December 5, 2025, in the Southern District of New York, alleging the AI startup “copied, distributed, and displayed” millions of Times articles without permission to power its products. The complaint also claims Perplexity’s tools produced fabricated content (“hallucinations”) that appeared alongside Times trademarks, and that the company scraped from paywalled sections of nytimes.com. The Times seeks damages and an injunction.
According to The Verge’s review of the filing, which links to the complaint, the Times argues that Perplexity served responses that were “verbatim or substantially similar” to its stories and ignored website protections such as robots.txt. A copy of the filing is posted on DocumentCloud.
What the Times alleges—and why it matters
The lawsuit centers on two fronts: copyright and consumer confusion. On copyright, the Times says Perplexity reproduced protected text, including from behind its paywall, rather than paraphrasing or linking out. On trademarks, the paper argues Perplexity’s answers sometimes fabricated details while displaying Times branding, risking false association under U.S. trademark law. If a court agrees, it could force stricter lines around how AI answer engines quote, summarize, and attribute journalism.
Perplexity’s response: ‘user-driven agents,’ citations, and new deals
Perplexity says it doesn’t scrape to build foundation models; instead, it “indexes publicly available web pages” and cites sources in real time. The company has framed its activity as user-initiated retrieval, in the form of retrieval-augmented generation (RAG), rather than bulk copying, and it dismissed the Times suit as another example of publishers resisting new technologies.
In August, Perplexity published a detailed rebuttal to Cloudflare’s accusations that it used “stealth” crawlers to skirt blocks, arguing that Cloudflare misattributed unrelated traffic and that Perplexity’s agents fetch content only when a user asks—“agents, not bots.”
Perplexity has also tried to calm publisher concerns through commercial partnerships. It launched a Publishers’ Program (2024) that shares ad revenue and analytics with participating outlets (e.g., Time, Der Spiegel, Fortune, Los Angeles Times, The Independent); in August 2025 it rolled out Comet Plus, pledging 80% of that $5 subscription to participating publishers and setting aside a $42.5 million fund for early partners. In October 2025, it signed a multi‑year licensing deal with Getty Images to display licensed visuals with attribution.
The wider backdrop: lawsuits, licensing—and a line in the sand
The Times–Perplexity clash is the latest in a wave of publisher actions testing how AI systems may use the news:
- Chicago Tribune sued Perplexity on Thursday, December 4, 2025, alleging the service reproduces Tribune content verbatim and uses RAG to pull from paywalled pages.
- News Corp’s Dow Jones and the New York Post sued Perplexity in October 2024, calling its practices a “massive amount of illegal copying.”
- Britannica and Merriam‑Webster sued Perplexity in September 2025 for copyright and trademark infringement.
- Reddit sued Perplexity (and others) on October 22, 2025, alleging “industrial‑scale” scraping.
- Separately, Cloudflare reported that Perplexity’s crawlers evaded robots.txt via “stealth” tactics; Perplexity disputes the findings.
Meanwhile, some AI firms are choosing licensing over litigation. OpenAI’s content deals now include the Financial Times, News Corp, Vox Media and others; Meta announced fresh agreements with major U.S. outlets; and in May 2025, the Times itself licensed content to Amazon for Alexa and model training—evidence that publishers will partner when the terms are right.
Recent publisher actions involving Perplexity
| Plaintiff | Filed | Forum/Status | Core allegation(s) |
|---|---|---|---|
| New York Times | Dec 5, 2025 | SDNY (filed) | Verbatim copying, paywall scraping, trademark confusion |
| Chicago Tribune | Dec 4, 2025 | Federal court in NY (filed) | Verbatim copying; RAG drawing from paywalled content |
| Reddit | Oct 22, 2025 | Federal court (filed) | Industrial‑scale scraping of user comments |
| Britannica & Merriam‑Webster | Sept 10, 2025 | SDNY (pending) | Copyright + trademark infringement |
| Dow Jones & New York Post | Oct 21, 2024 | SDNY (pending) | Massive unauthorized copying |
How this could reshape AI search and automation
For AI answer engines that rely on retrieval‑augmented generation, the Times case spotlights three design pressures:
- Quotation vs. synthesis. Courts may push systems away from long verbatim excerpts toward tighter synthesis with clear linking—and stronger safeguards against “substitutive” outputs that replace a visit to the source.
- Respect for publisher signals. Even though robots.txt isn’t an access control, ignoring it invites technical blocks, reputational damage, and legal risk when combined with copying claims. Expect wider use of bot‑blocking services and paywall hardening.
- Attribution and licensing. Users want instant answers, but publishers need traffic or fees. More engines may follow a hybrid path—capped snippets plus paid licensing for richer excerpts and media. Recent deals by OpenAI, Meta, and Amazon‑NYT point that way.
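The second pressure, respecting publisher signals, can be checked mechanically before any fetch. The following is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt body, the `ExampleAnswerBot` user agent, and the helper names are hypothetical and are not taken from the complaint or from any real crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body for illustration; a real fetcher would
# download this file from the publisher's site before crawling.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAnswerBot
Disallow: /

User-agent: *
Disallow: /subscribers/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse a robots.txt body that has already been retrieved."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if robots.txt permits this agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)
blocked_agent = may_fetch(parser, "ExampleAnswerBot", "https://example.com/news/story")
other_agent_open = may_fetch(parser, "OtherBot", "https://example.com/news/story")
other_agent_paywall = may_fetch(parser, "OtherBot", "https://example.com/subscribers/story")
```

In this sketch the agent that the publisher names explicitly is refused everywhere, while other agents are refused only under the hypothetical `/subscribers/` path. Because robots.txt is advisory, a check like this shows intent to comply; it does not replace paywall or access controls.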
Tip: Practical guardrails for teams building RAG and AI search
- Enforce strict anti‑copying thresholds (e.g., hard caps on contiguous characters and overall excerpt length) and prefer sentence‑level paraphrase.
- Heed robots.txt; implement allowlists/denylists and log every fetch by user and domain; honor paywall signals.
- Add citation‑first UX with prominent links and publisher logos only when licensed; avoid trademarked branding in UI near unverified claims.
- Where critical, license the sources or partner via revenue‑share; otherwise filter out blocked or high‑risk domains.
- Run pre‑publish hallucination tests and brand‑safety checks; quarantine outputs when source confidence is low.
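The first guardrail, a hard cap on contiguous copied text, could be sketched as follows. This is an illustrative check only: the 12-word cap, the word-level comparison, and the function names are assumptions made for this example, not a legal standard or any vendor's actual implementation.

```python
from difflib import SequenceMatcher

# Hypothetical cap on consecutive words copied from one source.
MAX_VERBATIM_WORDS = 12


def longest_verbatim_run(answer: str, source: str) -> int:
    """Length, in words, of the longest contiguous run shared by both texts."""
    answer_words = answer.lower().split()
    source_words = source.lower().split()
    # autojunk=False keeps frequent words from being ignored on long inputs.
    matcher = SequenceMatcher(None, answer_words, source_words, autojunk=False)
    match = matcher.find_longest_match(0, len(answer_words), 0, len(source_words))
    return match.size


def exceeds_cap(answer: str, source: str, cap: int = MAX_VERBATIM_WORDS) -> bool:
    """True if the answer copies more consecutive words than the cap allows."""
    return longest_verbatim_run(answer, source) > cap


source_text = "the quick brown fox jumps over the lazy dog near the old river bank today"
answer_text = "Reports say the quick brown fox jumps over the lazy dog yesterday"
run_length = longest_verbatim_run(answer_text, source_text)
```

An answer that trips the cap could be sent back for paraphrasing or trimmed to a shorter quotation with a link. A production check would also need to handle punctuation, near-verbatim rewording, and total excerpt length across multiple passages.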
What to watch next
- Early motions: Perplexity is likely to move to dismiss or oppose any request for a preliminary injunction; hearings could clarify whether courts treat RAG’s use of news text as fair use.
- Discovery on crawling: Expect deeper scrutiny of bot behavior, IP ranges, and how Perplexity gates paywalled content—issues raised in prior Cloudflare and media reports.
- More deals, more splits: As lawsuits proceed, more publishers may sign licensing deals with AI platforms—while others double down on litigation to set precedent.
Sources
- Reuters, “New York Times sues Perplexity AI for ‘illegal’ copying of content.”
- The Verge, “The New York Times sues Perplexity for producing ‘verbatim’ copies of its work.”
- NYT v. Perplexity complaint (DocumentCloud).
- TechCrunch, “Chicago Tribune sues Perplexity.”
- CNBC, “Dow Jones and New York Post sue Perplexity AI.”
- Britannica corporate site, “Britannica Files Copyright and Trademark Infringement Lawsuit Against Perplexity.”
- AP via ABC News, “Reddit sues over ‘industrial‑scale’ scraping of user comments.”
- Cloudflare blog, “Perplexity is using stealth, undeclared crawlers…” Perplexity response: “Agents or Bots?”
- Axios, “Perplexity’s Comet Plus subscription.”
- TechCrunch/GlobeNewswire, “Perplexity–Getty Images deal.”
- OpenAI–FT partnership (OpenAI). Vox Media–OpenAI.
- Meta’s news licensing deals (The Verge). Amazon–NYT licensing (CNBC).