In February, the 3D database Sketchfab introduced a NoAI tag for authors. But that was too late, as it now turns out.
The latest generation of generative AI models, such as DALL-E 2, Midjourney, Stable Diffusion, ChatGPT, and most recently GPT-4, brought a rude awakening: AI models for images, text, and code have the potential to dramatically change jobs and even eliminate them, especially those of people whose years of work have been collected en masse from the Internet as training data.
As a result, lawsuits are underway to determine whether data released under Creative Commons licenses may also be used freely for AI training. Some companies, such as Stability AI, give artists the option to opt out of training data; others, such as Adobe and Getty Images, compensate them for their participation.
Many public databases have also introduced new terms that allow creators to prohibit the use of their data for AI training.
Generative AI models for 3D are emerging
After text, code, and images, it has been clear for some time that generative AI models for 3D content will be the next target. The first models already exist, but their quality is still far from that of their text and 2D counterparts. The main reason for this is the lack of a large dataset of 3D content.
That has now changed: Researchers at the Allen Institute for AI and the University of Washington have released Objaverse, a massive 3D dataset. Objaverse contains more than 800,000 3D models with descriptions, including more than 44,000 animated 3D objects.
This makes Objaverse more than an order of magnitude larger than ShapeNet, the largest dataset to date, and it contains nearly 400 times as many categories, including photorealistic models.
Objaverse data is scraped from Sketchfab
The Objaverse data comes from the 3D platform Sketchfab and is licensed under Creative Commons licenses. The authors whose 3D models are included in the dataset were not informed that their data was being collected for AI training.
This is particularly controversial because the dataset includes 3D models whose creators have set the NoAI tag, which Sketchfab introduced in February. This is supposed to prevent exactly what happened.
Sketchfab also has a direct agreement with Epic Games, which acquired Sketchfab in the summer of 2021, that its hosted 3D models cannot be used by Epic Games to train generative AI models.
Sketchfab's NoAI tag came too late
"It appears Objaverse mass-downloaded these models from Sketchfab and redistributed them without our knowledge. So far, all of the models that we’ve seen were made downloadable for free on Sketchfab under Creative Commons licenses," Sketchfab said in a statement. The problem: "They did this before us implementing the noai tag," said Alban Denoyel, co-founder and CEO of Sketchfab.
Looks like I'm blocked by @lizaledwards.
Just a few things for context:
- those models were mass aggregated by objaverse without our knowledge
- they did this before us implementing the noai tag
- it's CC content set downloable by users
- we are looking into what resort we have
- alban denoyel (@albn) March 24, 2023
In short, Sketchfab introduced the NoAI tag to prevent exactly this kind of situation, but it came too late. "We understand artists’ concerns and are looking into it," the company said.
Whether the creators have any legal recourse against the dataset will probably only become clear in the next few months. That is when the first results of the lawsuits over text-to-image generative AI models should be available.
More information about Objaverse can be found on the Objaverse project page.