Shakespeare v. Anthropic: 100 Authors Sue Over Data Torrenting, Not AI Training

June 18, 2026 | Pascal's Substack | Partial | Very Weak

Tech Jacks Solutions AI News Coverage

One hundred authors who rejected the Bartz v. Anthropic settlement have filed a new copyright lawsuit targeting how Anthropic allegedly obtained training data, not what the company did with it. The procurement theory, if courts accept it, bypasses every fair use argument Anthropic has built.

Opt-out authors filing: 100

Key Takeaways

100 Bartz opt-out authors filed a new copyright suit against Anthropic targeting dataset procurement via BitTorrent rather than model training, a legally distinct theory
The complaint reportedly alleges use of Books3, LibGen, and PiLiMi datasets obtained via BitTorrent; these allegations have not been confirmed against the complaint text
Plaintiffs reportedly seek up to $71.4 million in statutory damages; the figure is unconfirmed without verified complaint access
If the procurement theory succeeds, fair use arguments built around transformative training use become legally irrelevant for this class of claim

Verdict

New copyright complaint filed targeting dataset torrenting, not model training

Court: N.D. Cal. (docket number unconfirmed)

Date: 2026-06-17

Implications: Procurement-phase infringement theory bypasses transformative use defense

Verification

Partial. Newsletter reporting (Pascal's Substack; Chat GPT Is Eating the World). Docket number not confirmed; $71.4M damages figure and dataset list not verified against complaint text.

A new lawsuit filed in the U.S. District Court for the Northern District of California on or around June 17, 2026, targets Anthropic on copyright grounds, but the legal theory is different from anything that has come before it in AI copyright litigation. The plaintiffs aren’t arguing about whether training AI models on copyrighted books is transformative. They’re arguing that Anthropic committed infringement before the first model was ever trained, by torrenting and retaining pirated datasets.

Lead plaintiff Thomas William Shakespeare, described in reporting as a British sociologist and bioethicist, heads a group of approximately 100 authors, according to Pascal’s Substack. All are opt-outs from the Bartz v. Anthropic class action settlement. Filing a new suit rather than accepting the settlement terms signals these plaintiffs believe they have a stronger theory than the one Bartz pursued.

The procurement theory, as characterized by legal analysts covering the case, focuses on the alleged copying and retention of three datasets: Books3, LibGen, and PiLiMi, reportedly obtained via BitTorrent. The argument, as analyzed in legal commentary, is that downloading and storing pirated material is itself an act of infringement, separate from and prior to any training use. Under this framing, Anthropic’s position that model training is transformative doesn’t matter. The infringement allegedly occurred at the point of acquisition.

The damages figure circulating in reporting is up to $71.4 million in statutory damages, reportedly the result of applying the $150,000 willful infringement ceiling per work across the plaintiff group. That number comes from newsletter reporting and hasn’t been confirmed against the complaint text; the docket number hasn’t been publicly confirmed either. Don’t treat $71.4 million as a verified demand; treat it as a placeholder until the complaint is accessible.
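As a rough sanity check on that figure, the reported total can be divided by the per-work willful ceiling to see how many works it would imply. This is a back-of-envelope sketch only: the two dollar amounts are the ones quoted in reporting, the function name is hypothetical, and the resulting work count is an inference from arithmetic, not a number taken from the complaint.

```python
# Back-of-envelope check of the reported damages figure.
# Inputs are the numbers quoted in newsletter reporting, not verified
# against the complaint; the implied work count is an inference only.

WILLFUL_CEILING_PER_WORK = 150_000   # reported per-work ceiling for willful infringement
REPORTED_TOTAL = 71_400_000          # reported maximum statutory damages (unverified)
APPROX_AUTHORS = 100                 # approximate plaintiff count from reporting


def implied_work_count(total: int, per_work: int) -> int:
    """Return how many works the total implies at the per-work ceiling."""
    works, remainder = divmod(total, per_work)
    if remainder:
        raise ValueError("total is not a whole multiple of the per-work ceiling")
    return works


works = implied_work_count(REPORTED_TOTAL, WILLFUL_CEILING_PER_WORK)
works_per_author = works / APPROX_AUTHORS

print(works)             # 476
print(works_per_author)  # 4.76
```

If the reported inputs are right, the figure implies several works per author rather than one, which is one more detail to check once the complaint text is available.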

What can be said with confidence: this is a distinct legal action from Bartz, filed by plaintiffs who chose not to settle, advancing a legal theory centered on procurement rather than training. That’s new territory for AI copyright litigation.

The catch is how quickly courts might engage with the procurement question. Fair use litigation in the AI context has moved slowly; Thomson Reuters v. Ross has been in the courts for years. A procurement theory could actually be simpler to litigate: did Anthropic download pirated material? Either it did or it didn’t. The factual question is narrower, even if the legal stakes are comparable.

Warning

The procurement theory changes the compliance question. It's not 'did we train on copyrighted material?' It's 'how did we obtain the data in the first place?' Companies that assembled training datasets from BitTorrent sources or third-party archives with unclear provenance face a different exposure than companies that trained on licensed or permissively scraped data.

For compliance teams at companies that assembled training datasets from web scrapes, BitTorrent sources, or purchased archives with unclear provenance: this lawsuit is a signal worth taking seriously now. The training-phase argument may never resolve cleanly. The procurement-phase argument might resolve faster and less favorably.

The real question is whether other plaintiffs in ongoing AI copyright cases adopt the procurement framing. If Shakespeare v. Anthropic gains traction, expect parallel amendments to complaints already in progress. Dataset provenance documentation, not just post-hoc training justification, becomes the compliance priority.
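For teams wondering what provenance documentation might look like in practice, here is a minimal sketch of a dataset inventory that records how each dataset was acquired and flags entries for review. Everything in it is an assumption for illustration: the class names, the channel categories, the example dataset names, and the review rule are hypothetical, and none of it reflects a legal standard or any company's actual process.

```python
# Illustrative dataset provenance inventory. All names and the review
# rule are hypothetical; this is not legal advice or a known standard.
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    LICENSED = "licensed"
    WEB_SCRAPE = "web_scrape"
    PURCHASED_ARCHIVE = "purchased_archive"
    BITTORRENT = "bittorrent"


@dataclass(frozen=True)
class DatasetRecord:
    name: str
    channel: Channel
    has_provenance_docs: bool  # e.g. a license, invoice, or crawl log on file


def needs_review(record: DatasetRecord) -> bool:
    """Flag BitTorrent-sourced data and anything lacking provenance records."""
    if record.channel is Channel.BITTORRENT:
        return True
    return not record.has_provenance_docs


inventory = [
    DatasetRecord("example-licensed-corpus", Channel.LICENSED, True),
    DatasetRecord("example-torrented-archive", Channel.BITTORRENT, False),
    DatasetRecord("example-purchased-archive", Channel.PURCHASED_ARCHIVE, False),
]

flagged = [r.name for r in inventory if needs_review(r)]
print(flagged)
```

The point of the sketch is the shape of the record, acquisition channel plus evidence on file, rather than the specific rule, which counsel would need to define.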
