A federal court's ruling against Ross Intelligence's use of copyrighted material for AI training may have limited implications for major AI companies, thanks to key differences in how their systems work.
A US federal court has rejected Ross Intelligence's fair use defense, ruling that the company's use of copyrighted material to train its AI violated copyright law. Ross had obtained roughly 25,000 legal summaries from Thomson Reuters' Westlaw database through a third-party contractor and converted them into training data for its AI system.
The court evaluated the four traditional fair use factors: the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect on the work's potential market. On balance, Ross' defense failed.
Two factors proved decisive against Ross: the commercial nature of its use and the lack of "transformative" value. The company had converted the legal summaries into numerical data about word relationships to train its AI, but the end product was essentially a competitor to Thomson Reuters' own system.
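The court filings don't spell out Ross' exact pipeline, but the general technique of turning text into numbers that capture word relationships can be sketched briefly. Everything below, the mini "summaries", the co-occurrence counting, and the SVD step, is an illustrative assumption, not Ross' actual method:

```python
import numpy as np
from itertools import combinations

# Illustrative stand-ins for legal headnotes; not Ross' actual data.
summaries = [
    "negligence requires duty breach causation damages",
    "breach of contract requires offer acceptance consideration",
    "causation links breach of duty to damages",
]

# Build a vocabulary and a word-word co-occurrence matrix:
# each entry counts how often two words appear in the same summary.
vocab = sorted({w for s in summaries for w in s.split()})
index = {w: i for i, w in enumerate(vocab)}
cooc = np.zeros((len(vocab), len(vocab)))
for s in summaries:
    for a, b in combinations(set(s.split()), 2):
        cooc[index[a], index[b]] += 1
        cooc[index[b], index[a]] += 1

# Compress the counts into dense vectors ("embeddings") with SVD.
# Words used in similar contexts end up with similar vectors:
# numerical data about word relationships, not copies of the text.
u, sing, _ = np.linalg.svd(cooc)
embeddings = u[:, :4] * sing[:4]
print(vocab[0], embeddings[0])
```

The court's point was that this kind of conversion alone didn't make the use transformative, because the resulting product served the same purpose as the original database.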
Market impact proved central to the ruling. The court considered both the existing legal research market and a potential market for AI training data, both of which it saw as at risk from Ross' actions. It also placed the burden of proof on Ross to show such markets wouldn't be affected, a detail that could influence future cases, particularly since some major AI labs already purchase training data, effectively acknowledging that a licensing market exists rather than relying on fair use.
## LLMs are different
Notably, the court explicitly stated that this ruling applies only to non-generative AI. Unlike language models, which learn from existing content in order to generate new material, Ross' system competed with Thomson Reuters directly by providing the same service: returning existing court decisions.
![Text excerpt from court decision: Ross vs. Thomson Reuters for the non-transformative use of headnotes by Ross's AI system for legal research.](https://the-decoder.com/wp-content/uploads/2025/02/genai_lawsuit_thomson_reuters.png)
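To make the court's distinction concrete, here is a deliberately toy contrast between the two kinds of system. The documents, the `retrieve` helper, and the bigram "model" are all invented for illustration:

```python
import random

# Retrieval-style system (like Ross): the stored works themselves are the output.
headnotes = {
    "duty of care": "Headnote 12: A landowner owes invitees a duty of care...",
    "consideration": "Headnote 47: A contract requires bargained-for consideration...",
}

def retrieve(query: str) -> str:
    # Returns an existing document verbatim -- a direct substitute for the source.
    return next((text for key, text in headnotes.items() if key in query),
                "no match")

# Generative system (like an LLM, radically simplified): only statistics about
# which word tends to follow which are kept; the training text is not stored.
bigrams = {"the": ["court", "contract"], "court": ["held"], "held": ["that"]}

def generate(start: str, length: int = 4) -> str:
    words = [start]
    for _ in range(length):
        words.append(random.choice(bigrams.get(words[-1], ["..."])))
    return " ".join(words)

print(retrieve("what is the duty of care?"))  # hands back a stored original
print(generate("the"))                        # assembles a new sequence
```

A retrieval engine substitutes for the source material by design; a generative model, at least in principle, only emits new sequences sampled from learned statistics.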
This distinction aligns with the fair use argument of OpenAI and other AI labs: their training data is used to develop general language (or coding, musical, artistic, etc.) skills, not to reproduce or compete with the original content. In addition, the training data is only "viewed" by the AI system during training rather than copied into the final model, much like a student learning from a textbook. Even so, the data must be stored, at least temporarily, for the training process to work.
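That "viewed, not copied" argument maps onto how a training step actually works. In the minimal sketch below (random stand-in tokens and a trivial one-matrix "model", both assumptions for illustration), the text has to sit in memory while updates are computed, but only the adjusted weights persist afterwards:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "training data": token IDs for a copyrighted text. It must be held
# in memory, at least temporarily, for training to work at all.
tokens = rng.integers(0, 50, size=100)

# A trivial "model": one embedding-like weight matrix. Real LLMs differ in
# scale and architecture, but likewise persist only parameters.
weights = rng.normal(size=(50, 8))

learning_rate = 0.01
for t, nxt in zip(tokens[:-1], tokens[1:]):
    # The text is "viewed": it drives a prediction error...
    error = weights[t] - weights[nxt]
    # ...and only the weights are nudged toward the next token's vector;
    # no token of the source text is written into the model itself.
    weights[t] -= learning_rate * error

# What survives training is the weight matrix, not the training text.
np.save("model_weights.npy", weights)
del tokens  # the temporary copy of the text can now be discarded
```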
## Different AI cases may play out differently
Courts appear to be taking a case-by-case approach to AI copyright issues. A single, definitive ruling on fair use for generative AI training data isn't expected anytime soon. Instead, different precedents will likely emerge as courts grapple with various AI systems and their unique contexts.
The Ross case was relatively straightforward: one legal database competing directly with another. But future cases raise more nuanced questions: Is a chatbot truly competing with a news website? Does an AI music generator actually compete with human musicians? When does AI-generated content become a substitute for the original work?
These questions are far more complex than the Ross case and will likely require careful consideration by courts. A recent lawsuit by the news sites Raw Story and AlterNet against OpenAI illustrates this complexity: the judge dismissed the case, finding the outlets hadn't shown concrete harm, since ChatGPT generates new content rather than copying articles directly and facts themselves aren't protected by copyright.
For Ross Intelligence, the legal battle proved fatal. The company shut down in 2021, unable to raise enough funds to continue operating while fighting what it called an unfounded lawsuit.