In February, the 3D database Sketchfab introduced a NoAI tag for authors. But that was too late, as it now turns out.


The latest generation of generative AI models, such as DALL-E 2, Midjourney, Stable Diffusion, ChatGPT, and most recently GPT-4, brought a rude awakening: AI models for images, text, and code have the potential to change jobs dramatically and even eliminate them, especially the jobs of people whose years of work have been collected en masse from the Internet as training data.

As a result, lawsuits are underway to determine whether data released under Creative Commons licenses is also free to use for AI training. In some cases, companies like Stability AI give artists the option to opt out of the training data, while others, such as Adobe and Getty Images, compensate them for their participation.

Many public databases have also introduced new terms that allow creators to prohibit the use of their data for AI training.


Generative AI models for 3D are emerging

After text, code, and images, it has been clear for some time that generative AI models for 3D content will be the next target. The first models already exist, but their quality is still far from that of their text and 2D counterparts. The main reason for this is the lack of a large dataset of 3D content.

That has now changed: Researchers at the Allen Institute for AI and the University of Washington have released Objaverse, a massive 3D dataset. Objaverse contains more than 800,000 3D models with descriptions, including more than 44,000 animated 3D objects.

The quality of the 3D models in the Objaverse dataset is significantly higher than in the best dataset available so far, ShapeNet. | Image: Screenshot from AI2 Objaverse website

This makes Objaverse more than an order of magnitude larger than the largest dataset to date, ShapeNet. It also contains nearly 400 times as many categories, including photorealistic models.

Objaverse includes over 800,000 3D models from Sketchfab. | Image: Deitke et al.

Objaverse data is scraped from Sketchfab

The Objaverse data comes from the 3D platform Sketchfab and is available under Creative Commons licenses. The authors whose 3D models are included in the dataset were not informed that their data was being collected for AI training.

This is particularly controversial because the dataset includes 3D models whose creators have set the NoAI tag, which Sketchfab introduced in February. The tag is supposed to prevent exactly what happened.


The company also has a direct agreement with Epic Games, which acquired Sketchfab in the summer of 2021, under which Epic Games cannot use the hosted 3D models to train generative AI models.

Sketchfab's NoAI tag came too late

"It appears Objaverse mass-downloaded these models from Sketchfab and redistributed them without our knowledge. So far, all of the models that we’ve seen were made downloadable for free on Sketchfab under Creative Commons licenses," Sketchfab said in a statement. The problem: "They did this before us implementing the noai tag," said Alban Denoyel, co-founder and CEO of Sketchfab.

In short, Sketchfab introduced the NoAI tag to prevent exactly this kind of situation, but the tag came too late. "We understand artists’ concerns and are looking into it," the company said.

Whether the creators have any legal recourse against the dataset will probably only become clear in the next few months. That is when the first results of the court case over text-to-image generative AI models are expected.

More information about Objaverse can be found on the Objaverse project page.

Summary

  • Objaverse is a new dataset of 3D models designed to enable generative AI for 3D. The data comes from Sketchfab.
  • The company said the data was collected en masse without its or the artists' knowledge.
  • In February, Sketchfab introduced a NoAI tag to prevent this from happening - too late, as it turns out.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.