AI copyright litigation has been asking the same question for two years: did training on
copyrighted data without authorization constitute infringement? Courts are still working
through it. The cases are complex, the fair use defenses are contested, and the outcome
remains genuinely uncertain.
This week, the NYT’s lawyers sought to file a third amended complaint that asks a
structurally different question. Not “was training on our content infringement?” but “who
built the computer that made the training possible, and do they share liability?”
That’s the infrastructure liability theory. It’s newer, less tested, and, if it gains
traction, more far-reaching than anything the first generation of AI copyright suits
raised.
—
The two actions: what each actually alleges
Two legally distinct actions arrived in the same week. They’re related but shouldn’t be
conflated.
The WEHCO coalition. According to reports, a coalition of approximately 400 local and
regional U.S. newspapers, reportedly led by WEHCO Newspapers Inc. (publisher of the
Arkansas Democrat-Gazette and the Chattanooga Times Free Press), filed a federal copyright
suit against OpenAI and Microsoft around June 24, 2026. The complaint reportedly alleges
that automated tools were used to systematically scrape paywalled content from member
publications and strip copyright management information from articles in violation of the
DMCA. Specific tool names cited in the complaint could not be independently confirmed.
These details are drawn from a single secondary source; court document verification is
pending, and the specifics should be treated as unconfirmed.
The NYT amended complaint. On June 25, 2026, the New York Times applied to submit a third
amended complaint in its existing copyright suit against OpenAI and Microsoft. This filing
is confirmed. 36Kr’s reporting on the
complaint confirms the core allegation: that Microsoft didn’t just provide generic cloud
computing services but “actively induced, assisted, and facilitated large-scale copyright
infringement” by building a purpose-specific supercomputing system. That system, according
to the complaint, contains more than 285,000 CPU cores and 10,000 GPUs, infrastructure
designed and built to enable OpenAI’s AI model training at scale.
The distinction between these two suits matters legally. The WEHCO coalition is asserting
direct infringement and DMCA violations: a claim about what happened to their content. The
NYT’s amended complaint adds an affirmative inducement theory: a claim about Microsoft’s
role in making the alleged infringement possible.
—
The infrastructure liability theory: what it means legally
Copyright law has long recognized a spectrum of liability beyond direct infringement. Contributory infringement holds a party liable if it knows of infringing activity and
materially contributes to it. Vicarious liability applies if a party profits from the
infringement and has the right and ability to supervise it. Inducement liability, the
theory closest to what NYT is asserting, holds that actively encouraging or enabling
infringement creates liability, even for a party that didn’t directly copy anything.
The NYT’s characterization of Microsoft as an entity that “actively induced, assisted, and
facilitated” the alleged infringement positions the argument firmly in inducement territory. That’s a deliberate choice. Pure contributory liability requires showing Microsoft knew
about specific infringing acts. Inducement can be shown through evidence of intent and
purpose-built enablement, and a supercomputer with more than 285,000 CPU cores, built
specifically for AI training, is exactly the kind of purpose-built infrastructure that
inducement arguments reach for.
Microsoft’s likely response is that it was providing infrastructure services, not making
editorial decisions about what content to train on. General-purpose infrastructure
providers have historically had strong defenses against secondary liability. But the NYT’s
framing (purpose-built, customized, specific to OpenAI’s training operations) is
designed to undercut the “neutral infrastructure” defense.
Courts have seen this argument structure before. The Sony Betamax case established that
technologies capable of substantial non-infringing uses generally don’t create liability
for their manufacturers. The Grokster case modified that: when a device is distributed
with the intent to promote infringing use, the manufacturer can be liable. The NYT’s
amended complaint is building a Grokster-style argument: Microsoft didn’t just sell
general cloud capacity; it built a custom machine intended to enable specific operations
that NYT claims were infringing.
Whether that argument succeeds depends entirely on what evidence exists about Microsoft’s
intent and knowledge when it built the system. That’s a fact-intensive inquiry that won’t
resolve quickly.
—
The litigation trajectory: how we got here
The NYT’s original suit against OpenAI was filed in December 2023, the opening shot in
what has become a wave of AI copyright litigation. TJS covered a significant development on
June 18 when the litigation landscape shifted toward new theories beyond the training
data question. The third amended complaint represents a further escalation: dropping some
theories that proved harder to sustain (TJS covered NYT dropping contributory claims in
its June 26 brief) and adding the infrastructure theory that NYT’s legal team apparently
now views as more promising.
That pattern, narrowing and sharpening claims over successive amendments, is common in
complex copyright litigation. It doesn’t necessarily indicate weakness; it often signals
that plaintiffs have learned what evidence exists and are focusing on the strongest claims.
The WEHCO coalition, if the reports are accurate, represents a different development:
organized collective action by local news organizations, many of which lack the legal
budget to litigate individually against companies of OpenAI’s and Microsoft’s scale. Coalition suits of this type are a recognized strategy for amplifying plaintiff leverage
in copyright cases.
—
Supply chain exposure: who else should be watching
The infrastructure liability theory, if it develops legal traction, has consequences well
beyond Microsoft.
Any company that provides purpose-built AI training infrastructure (specialized compute
clusters, optimized networking, model-specific storage systems) faces a version of the
same question: is providing infrastructure to a company that subsequently infringes
copyright enough for liability, when the infrastructure was designed with that company’s
training operations in mind?
That question reaches cloud providers, co-location facilities with AI-specific buildouts,
and hardware manufacturers who’ve sold purpose-configured systems to AI labs. It also
reaches API distributors and embedding service providers who might be characterized as
amplifying the reach of allegedly infringing model outputs.
Compliance and legal teams at companies in the AI supply chain should be reviewing their
indemnification arrangements now. Standard cloud computing contracts typically include
indemnification for IP claims arising from customer content, but those clauses weren’t
written with purpose-built AI training infrastructure in mind. Whether they cover
infrastructure liability theories is a question that should be answered before a court
asks it.
Warning
Standard cloud computing indemnification contracts typically cover IP claims arising from customer content, not infrastructure liability theories. Companies providing purpose-built AI training compute should verify whether their contracts cover the specific legal theory NYT is now asserting before courts develop the factual record.
—
What to watch
Three near-term signals matter:
First, whether the federal court accepts the NYT’s third amended complaint. If the court
rejects it, the infrastructure theory loses its most prominent test vehicle. If accepted,
discovery proceeds and the factual record about Microsoft’s knowledge and intent starts
to develop.
Second, court documents in the WEHCO coalition suit. The specific tool names, newspaper
count, and DMCA CMI removal allegations in the WEHCO complaint are currently single-source
and unconfirmed. PACER filings will either confirm or complicate the reported details.
Third, whether other publishers move to file similar coalition suits. The WEHCO model,
if confirmed, demonstrates an organizational structure for smaller publishers to pursue
claims they couldn’t sustain individually.
—
TJS synthesis
The infrastructure liability theory is the most consequential legal development in AI
copyright litigation since the original NYT filing in 2023. It doesn’t ask whether AI
training was lawful. It asks whether building the machine that made training possible
creates liability for the builder, and it does so with a specific, named supercomputing
system as its exhibit. Courts will take years to resolve this. But the companies whose
lawyers are reading this complaint carefully today and asking “does our indemnification
structure cover this?” are better positioned than those who wait for a ruling to find
out the answer is no.