Microsoft is being sued by several authors who say their books were used without permission to train a Megatron model. The lawsuit, filed in federal court in New York, claims Microsoft used a dataset of about 200,000 pirated books to build a system that mimics the style, voice, and themes of the original works. The plaintiffs are asking for a ban on further use and up to $150,000 in damages per title.

Courts in similar cases involving Meta and Anthropic have said such use may qualify as "transformative" under fair use rules. But several questions remain open: whether using pirated books overrides a fair use defense, whether and to what extent scraping copyrighted content from the internet is legal, and whether the training harms the market for the original books, which could prevent the use from being considered fair use.

Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.