
Shakespeare v. Anthropic: 100 Authors Sue Over Data Torrenting, Not AI Training

One hundred authors who rejected the Bartz v. Anthropic settlement have filed a new copyright lawsuit targeting how Anthropic allegedly obtained training data, not what the company did with it. The procurement theory, if courts accept it, bypasses every fair use argument Anthropic has built.

Key Takeaways

  • 100 Bartz opt-out authors filed a new copyright suit against Anthropic that targets dataset procurement via BitTorrent rather than model training, a legally distinct theory
  • The complaint reportedly alleges use of Books3, LibGen, and PiLiMi datasets obtained via BitTorrent; these allegations have not been confirmed against the complaint text
  • Plaintiffs reportedly seek up to $71.4 million in statutory damages; the figure is unconfirmed without verified complaint access
  • If the procurement theory succeeds, fair use arguments built around transformative training use become legally irrelevant for this class of claim

Verdict

New copyright complaint filed targeting dataset torrenting, not model training
Court: N.D. Cal. (docket number unconfirmed)
Date: 2026-06-17
Implications: Procurement-phase infringement theory bypasses transformative use defense

Verification

Partial. Source: newsletter reporting (Pascal's Substack; ChatGPT Is Eating the World). Docket number not confirmed; $71.4M damages figure and dataset list not verified against complaint text.

A new lawsuit filed in the U.S. District Court for the Northern District of California on or around June 17, 2026 targets Anthropic on copyright grounds, but the legal theory is different from anything that has come before it in AI copyright litigation. The plaintiffs aren’t arguing about whether training AI models on copyrighted books is transformative. They’re arguing that Anthropic committed infringement before the first model was ever trained, by torrenting and retaining pirated datasets.

Lead plaintiff Thomas William Shakespeare, described in reporting as a British sociologist and bioethicist, heads a group of approximately 100 authors, according to Pascal’s Substack. All are opt-outs from the Bartz v. Anthropic class action settlement. Filing a new suit rather than accepting the settlement terms signals these plaintiffs believe they have a stronger theory than the one Bartz pursued.

The procurement theory, as characterized by legal analysts covering the case, focuses on the alleged copying and retention of three datasets: Books3, LibGen, and PiLiMi, reportedly obtained via BitTorrent. The argument, as analyzed in legal commentary, is that downloading and storing pirated material is itself an act of infringement, separate from and prior to any training use. Under this framing, Anthropic’s position that model training is transformative doesn’t matter. The infringement allegedly occurred at the point of acquisition.

The damages figure circulating in reporting is up to $71.4 million in statutory damages, reportedly the result of applying the $150,000 willful infringement ceiling per work across the plaintiff group. That number comes from newsletter reporting and hasn’t been confirmed against complaint text; the actual docket number hasn’t been publicly confirmed. Don’t treat $71.4 million as a verified demand. Treat it as a placeholder until the complaint is accessible.
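As a quick sanity check on the reported arithmetic, the short sketch below works backward from the $71.4 million figure to the number of works it would imply. It assumes the figure is exactly the $150,000 per-work willful ceiling multiplied by a whole number of works; that assumption, the function name, and the constants are illustrative and are not drawn from the complaint.

```python
# Per-work ceiling for willful infringement, as cited in the reporting.
WILLFUL_CEILING_PER_WORK = 150_000

# Unverified total from newsletter reporting, not from the complaint text.
REPORTED_MAX_DAMAGES = 71_400_000


def implied_work_count(total: int, per_work: int) -> int:
    """Return how many works a total implies at a fixed per-work amount.

    Raises ValueError if the total is not a whole multiple of per_work,
    which would mean the simple "ceiling times works" reading does not hold.
    """
    works, remainder = divmod(total, per_work)
    if remainder:
        raise ValueError("total is not a whole multiple of the per-work amount")
    return works


works = implied_work_count(REPORTED_MAX_DAMAGES, WILLFUL_CEILING_PER_WORK)
print(works)  # 476
```

Under that assumption the figure corresponds to 476 works, or several works per author on average across roughly 100 plaintiffs. The actual count of works in suit can only be confirmed from the complaint.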

What can be said with confidence: this is a distinct legal action from Bartz, filed by plaintiffs who chose not to settle, advancing a legal theory centered on procurement rather than training. That’s new territory for AI copyright litigation.

The catch is how quickly courts might engage with the procurement question. Fair use litigation in the AI context has moved slowly; Thomson Reuters v. Ross has been in the courts for years. A procurement theory could actually be simpler to litigate: did Anthropic download pirated material? Either it did or it didn’t. The factual question is narrower, even if the legal stakes are comparable.

Warning

The procurement theory changes the compliance question. It's not 'did we train on copyrighted material?' It's 'how did we obtain the data in the first place?' Companies that assembled training datasets from BitTorrent sources or third-party archives with unclear provenance face a different exposure than companies that trained on licensed or permissively scraped data.

For compliance teams at companies that assembled training datasets from web scrapes, BitTorrent sources, or purchased archives with unclear provenance: this lawsuit is a signal worth taking seriously now. The training-phase argument may never resolve cleanly. The procurement-phase argument might resolve faster and less favorably.

The real question is whether other plaintiffs in ongoing AI copyright cases adopt the procurement framing. If Shakespeare v. Anthropic gains traction, expect parallel amendments to complaints already in progress. Dataset provenance documentation, not just post-hoc training justification, becomes the compliance priority.
