German court allows non-profit LAION to scrape copyrighted images for AI training

Sep 30, 2024


Update – Sep 30, 2024

Added statement from the law firm representing LAION

A Hamburg court has ruled that LAION, a non-profit organization, can collect copyrighted images for training AI systems without getting permission from the photographer. The decision leaves the most interesting question unanswered.

In a case between a photographer and LAION, the Hamburg Regional Court sided with LAION (case number 310 O 227/23). The non-profit, which specializes in creating datasets for AI training, had downloaded an image from a photo agency's website, paired it with a description, and added the URL and description to its freely available "LAION-5B" dataset of 5.85 billion image-text pairs. The photographer sued LAION for copyright infringement.

The court confirmed that downloading and processing the image constituted a copyright-relevant reproduction. However, it ruled this action was justified under Section 60d of German copyright law, which permits text and data mining for non-commercial scientific research.

The court focused on LAION's specific actions, not its organizational structure. Since LAION released the dataset freely for research, it wasn't pursuing commercial goals. The fact that companies also use the dataset didn't matter.

"The ruling of the Hamburg Regional Court creates an important basis for the legally compliant use of publicly accessible data in the context of scientific research. It confirms that the association can continue to make a significant contribution to the promotion of open-source initiatives in the future, which promotes AI development in Germany in particular," writes Heidrich Rechtsanwälte, the law firm representing LAION.

Fair use issue still unresolved

The court didn't need to decide whether LAION could also rely on Section 44b, a more general exception for text and data mining. This section allows copying lawfully accessible works for text and data mining, which is defined as automated analysis of digital works to extract information about patterns, trends, and correlations. Copies must be deleted when no longer needed for mining.

However, rights holders can reserve these uses. For works available online, the reservation only counts if it is made in machine-readable form. The court doubted the photo agency's website had such a machine-readable notice restricting use. Given the importance of the case, the photographer is likely to appeal to a higher court.

The ruling shows that research groups can collect AI training data. But it's unclear whether this applies to for-profit companies, and it's only about collecting the data, not actually using it to train AI systems. Companies like OpenAI have done both: they've taken copyrighted online data without permission and used it to train their systems.

There are several lawsuits pending on this issue, the most high-profile of which is probably the one between the New York Times and OpenAI.
