The New York Times Co. v. Microsoft & OpenAI

Aug 23, 2025

“NYT sues AI giants for ‘mass copyright infringement’—claims Times content was used to train ChatGPT without permission.”

Short Summary :

In December 2023, The New York Times filed a federal lawsuit against OpenAI and Microsoft, alleging that they used millions of Times articles without authorization to train AI models like ChatGPT. The complaint argues this constitutes copyright infringement, damages the newspaper’s business model, and unfairly competes with its journalism offerings. As of mid-2025, the court has allowed the core copyright claims to proceed, and OpenAI is appealing a data preservation order.

Facts :

⦁ Plaintiff: The New York Times — owner of copyrighted articles, in-depth investigations, opinion pieces, reviews, how-to guides, etc.

⦁ Defendants: OpenAI (creator of ChatGPT) and Microsoft (investor and provider of Azure infrastructure).

⦁ Allegations: The Times claims the defendants:

⦁ Used its content without permission to train AI models.

⦁ Generated text that recites content verbatim or mimics the Times’ expressive style.

⦁ Diverted subscription, licensing, affiliate, and advertising revenue by reproducing Times content.

⦁ Negotiations: The Times attempted to negotiate a licensing deal with the defendants in April 2023, but the talks failed.

⦁ Additional harms cited: The models hallucinate and falsely attribute information to the Times, which confuses attribution and undermines trust in its reporting.

Findings / Procedural Developments :

⦁ On March 26, 2025, Judge Sidney Stein (S.D.N.Y.) allowed most of the copyright claims to proceed, possibly toward a jury trial. Some peripheral claims were dismissed, but the “core thrust” remains intact.

⦁ The defendants’ motion to dismiss was largely denied, allowing the case to move forward to discovery and potential summary judgment.

⦁ On June 6, 2025, OpenAI filed an appeal against a court order requiring indefinite preservation and segregation of ChatGPT output logs. OpenAI argues that the order conflicts with its user privacy commitments.

Suggestion / Implications :

⦁ Copyright Boundaries for AI: If the Times prevails, the case could set a precedent requiring AI developers to license copyrighted training data from news organizations, reshaping how training data is acquired.

⦁ Fair Use Under Scrutiny: The fair use defense, which AI developers commonly raise, could be significantly narrowed, especially where outputs compete with the original content or replicate it verbatim.

⦁ Press Protections: The case demonstrates publishers’ willingness and ability to enforce their rights against tech giants and to press for compensation and attribution when AI systems use their work.

⦁ Privacy vs Discovery: The dispute over the data preservation order reflects the tension between litigation needs and AI services’ user privacy promises, and it could shape discovery protocols for AI-generated outputs.