NYT slams OpenAI's request for reporter notes as "unprecedented" and "harassing"

Jul 29, 2024

Stéphan Valentin, Unsplash / OpenAI, prompted by THE DECODER

OpenAI is seeking access to research materials and other internal documents from the New York Times as part of an ongoing copyright lawsuit. The Times strongly opposes this request, viewing it as an attempt to intimidate journalists.

As part of the newspaper's copyright lawsuit against the AI company, OpenAI is now demanding access to research documents, notes, and other internal materials from the Times, arguing that these documents are essential to evaluating the copyright status of the Times' content.

The Times strongly opposes releasing the requested documents. It calls OpenAI's demands "unprecedented" and "harassing," seeing them as an attempt to undermine established intellectual property rights and intimidate the news organization.

Image: Case 1:23-cv-11195-SHS Document 156 (screenshot)

The Times argues the copyright nature of its articles should be judged based on the published works themselves, not reporters' private notes or interview materials. The newspaper warns this could have a chilling effect on journalists and news organizations.

OpenAI's lawyers counter that the New York Times created this relevance by claiming in its original complaint that the allegedly copyrighted works required "enormous amount of time … expertise, and talent," and "deep investigations."

The New York Times filed a copyright infringement lawsuit against OpenAI and Microsoft in late 2023, after demonstrating that ChatGPT was in some cases reproducing NYT content verbatim. OpenAI responded with a partial motion to dismiss, accusing the Times of creating copies of NYT content through targeted "prompt hacking" and a massive number of attempts, in violation of OpenAI's terms of service.

What about fair use?

Interestingly, it appears that OpenAI's lawyers are not directly seeking a fair use ruling, but are instead focusing on other potentially copyright and technology-related issues. If the NYT's case is dismissed on these grounds, OpenAI would win the case - but a fair use ruling, and a potential definitive solution to AI training and what data can be used for it, would remain open.

OpenAI and Microsoft may be playing for time until fair use becomes less relevant to AI companies as they find ways to train AI models without potentially infringing copyright, such as using a mix of high-quality licensed data and synthetic datasets derived from it.

Meta recently released updates to the open-source Llama 3 model, whose smaller models in version 3.1 were improved using training data from the larger 405B Llama model, which acted as a "teacher model." The often feared "model collapse" apparently did not occur. OpenAI CEO Sam Altman recently said that the future of AI training is about learning more from less data.

AI News Without the Hype – Curated by Humans

As a THE DECODER subscriber, you get ad-free reading, our weekly AI newsletter, the exclusive "AI Radar" Frontier Report 6× per year, access to comments, and our complete archive.

AI news without the hype
Curated by humans.

Over 20 percent launch discount.
Read without distractions – no Google ads.
Access to comments and community discussions.
Weekly AI newsletter.
6 times a year: “AI Radar” – deep dives on key AI topics.
Up to 25 % off on KI Pro online events.
Access to our full ten-year archive.
Get the latest AI news from The Decoder.

Subscribe to The Decoder

NYT slams OpenAI's request for reporter notes as "unprecedented" and "harassing"

What about fair use?

AI News Without the Hype – Curated by Humans

AI news without the hypeCurated by humans.

AI news without the hype
Curated by humans.