Microsoft is taking a new approach to using copyrighted books for AI training by offering payment to HarperCollins authors. The deal sheds light on how the industry values creative work in the AI era.

The company has proposed a licensing agreement with publisher HarperCollins that would pay $5,000 per book for AI training rights. Authors would receive half of that amount, or $2,500 per book, according to the publisher.

Alice Robb, who covered the story for Bloomberg, received the same HarperCollins offer for her 2018 book "Why We Dream." The deal gives Microsoft a three-year training license, with authors free to accept or decline.

But putting a price tag on these rights isn't simple. "My first impulse was to outsource the decision to my agent, but she demurred," Robb writes. The contract had no precedent or room for negotiation, and she had just one week to choose.

Robb ultimately took the deal, though she doesn't know whether it was the right call. "As far as I can tell, neither does anyone else," she writes, noting that her eight-year-old book has already been used to train AI systems without permission anyway, likely by Microsoft or OpenAI.

The decision becomes even harder given authors' financial struggles, Robb notes. The Authors Guild reports that full-time authors' median annual income is just $20,000. In the UK, professional writers earn a median of £7,000 (about €8,400) yearly.

From piracy to payment

Brown University economist Emily Oster sees Microsoft's approach as calculated: "They’re trying to establish the idea that the rights to train on books are worth $5,000. You can’t do that by going to the latest bestseller. So you do that by going to the backlist — to people who aren’t collecting royalties — and telling them, ‘Look, would you like some free money?’"

While Microsoft is seeking licenses in this case, other AI companies claim that "fair use" allows them to train AI on copyrighted works without payment. They argue that transforming existing data into new products is permitted under copyright law. Authors, publishers, and artists disagree, which has led to multiple lawsuits.

Meta recently showed how ruthlessly AI companies collect training data. Court documents revealed that despite internal warnings, the company deliberately used piracy networks to download copyrighted books for AI training and systematically removed copyright notices.

Microsoft's and OpenAI's move toward licensing suggests that the big AI labs may be backing away from their stance that using copyrighted content without permission is legal, and taking a more thoughtful approach. Some AI labs are even buying second-tier video content from YouTube creators.

Summary
  • Microsoft has offered HarperCollins a licensing deal to use its books to train AI models, with authors receiving half of the $5,000 per-book fee.
  • The offer has left many authors unsure whether to accept, because it is difficult to judge whether the compensation is fair. The decision is harder still given that the median annual income for full-time authors is just $20,000, according to the Authors Guild.
  • Authors are grappling with the decision to allow their work to be used in AI training, weighing the financial benefits against the potential impact on their intellectual property and the uncertainty of fair compensation in the context of generative AI.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.