Summary

OpenAI is seeking access to research materials and other internal documents from the New York Times as part of an ongoing copyright lawsuit. The Times strongly opposes this request, viewing it as an attempt to intimidate journalists.


As part of the newspaper's copyright lawsuit against the AI company, OpenAI is now demanding access to research documents, notes, and other internal materials from the Times, arguing that these documents are essential to evaluating the copyright status of the Times' content.

The Times strongly opposes releasing the requested documents. It calls OpenAI's demands "unprecedented" and "harassing," seeing them as an attempt to undermine established intellectual property rights and intimidate the news organization.

Image: Case 1:23-cv-11195-SHS Document 156 (screenshot)

The Times argues that whether its articles are protected by copyright should be judged based on the published works themselves, not on reporters' private notes or interview materials. The newspaper warns that forcing it to hand over such materials could have a chilling effect on journalists and news organizations.


OpenAI's lawyers counter that the New York Times itself made these materials relevant by claiming in its original complaint that the works at issue required an "enormous amount of time … expertise, and talent," and "deep investigations."

The New York Times filed a copyright infringement lawsuit against OpenAI and Microsoft in late 2023, after demonstrating that ChatGPT was in some cases reproducing NYT content verbatim. OpenAI responded with a partial motion to dismiss, accusing the Times of creating copies of NYT content through targeted "prompt hacking" and a massive number of attempts, in violation of OpenAI's terms of service.

What about fair use?

Interestingly, it appears that OpenAI's lawyers are not directly seeking a fair use ruling, but are instead focusing on other copyright and technology-related issues. If the NYT's case is dismissed on these grounds, OpenAI would win, but the fair use question, and with it a potentially definitive answer on which data may be used for AI training, would remain open.

OpenAI and Microsoft may be playing for time until fair use becomes less relevant to AI companies as they find ways to train AI models without potentially infringing copyright, such as using a mix of high-quality licensed data and synthetic datasets derived from it.

Meta recently released version 3.1 of its open-source Llama 3 models. The smaller models in this release were improved using training data from the larger 405B Llama model, which acted as a "teacher model." The often feared "model collapse" apparently did not occur. OpenAI CEO Sam Altman recently said that the future of AI training is about learning more from less data.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • OpenAI is demanding access to research papers, notes and other internal documents from the New York Times as part of the newspaper's copyright lawsuit against the AI company, arguing that they are crucial to assessing whether the Times' copyright claims are valid.
  • The New York Times vehemently opposes the release of the requested documents, calling OpenAI's demands "unprecedented," "harassing," and an attempt to undermine established intellectual property rights and intimidate the news organization.
  • OpenAI's lawyers appear to be focusing on copyright and technical issues rather than directly seeking to clarify fair use, possibly playing for time until fair use becomes less relevant to AI companies as they find ways to train AI models without potentially infringing copyright.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.