Update
  • Added information about the firing of Shira Perlmutter following the report's release.

The US Copyright Office has pushed back against one of the AI industry's most common legal arguments: that training AI models on copyrighted material generally qualifies as fair use.

In a new report, the agency rejects several of the industry's key justifications, such as comparing AI training to human learning or claiming that training is a "non-expressive" use. The non-expressive argument assumes that models merely identify statistical patterns in the data rather than reproduce creative expression.

The Copyright Office disagrees. If an AI model generates output that resembles human-created work in terms of style, function, or expression, then that output is considered "expressive." And if that output competes with the original works in the market, it weighs against a fair use defense.

A central argument in the report is that AI systems process information fundamentally differently from humans. While people retain partial, filtered impressions of creative works—shaped by memory, personality, and context—AI models ingest perfect copies, analyze them almost instantly, and generate new content at "superhuman speed and scale," according to the Copyright Office.

"Generative model training transcends the human limitations that underlie the structure of the exclusive rights."

Professor Robert Brauneis, Copyright and the Training of Human Authors and Generative Machines

Update: Shortly after the report was released, the Trump administration fired Shira Perlmutter, head of the US Copyright Office. The move drew immediate backlash. "Donald Trump's termination of Register of Copyrights, Shira Perlmutter, is a brazen, unprecedented power grab with no legal basis. It is surely no coincidence he acted less than a day after she refused to rubber-stamp Elon Musk's efforts to mine troves of copyrighted works to train AI models," wrote Rep. Joe Morelle, the top Democrat on the Committee on House Administration.

Licensing, not litigation

The full report leaves room for some narrow exceptions. Certain training uses might be transformative enough to qualify as fair use, depending on several factors: what kind of work is being used, how it was obtained, the purpose of the training, and whether the resulting output is controlled or competes with the original. In research or analytical contexts, for example, generated content is less likely to serve as a substitute for the original and may lean toward fair use.

But when it comes to commercial AI systems that use "vast troves of copyrighted works to produce expressive content that competes with them in existing markets," the Copyright Office draws a clear line, stating that this "goes beyond established fair use boundaries."

How the training data was obtained also matters. Using illegally sourced works—like those taken from piracy sites or behind paywalls—hurts the fair use argument, the agency says, and some current datasets appear to include such material.

Rather than calling for new legal restrictions, the Copyright Office urges further development of voluntary licensing markets. Early forms of individual and collective licensing are emerging in some sectors, and for areas where licensing systems don't yet exist, the agency suggests alternatives like extended collective licensing.

At this stage, the Copyright Office sees government intervention as premature, citing both the early development of licensing markets and a lack of consensus for new laws.

No blanket fair use, but no outright ban

Despite rejecting industry-wide fair use claims, the Copyright Office stops short of calling for a general ban on AI training. It stresses that fair use is a flexible legal doctrine that has adapted to past waves of technological change, and that it should stay flexible.

According to the report, the best way to maintain the United States' leadership in AI is to support both innovation and copyright protection. The goal, the office says, is to ensure that these technologies benefit not only the developers building the models but also the creators whose content powers them—and ultimately, the public at large. The Copyright Office says it will continue to advise Congress on the issue.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • The US Copyright Office has challenged the AI industry's claim that training models on copyrighted works is generally fair use, arguing that AI-generated outputs which resemble human-created works—especially if they compete in the market—may not qualify for this defense.
  • The agency notes a fundamental difference between how AI systems and humans process creative content, highlighting that AI can ingest perfect copies and generate new content at speeds and scales far beyond human capability.
  • While stopping short of advocating for new legal restrictions or a general ban, the Copyright Office urges the development of voluntary licensing markets and emphasizes that fair use should remain flexible, balancing innovation with copyright protection.
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.