Meta used content from Facebook and Instagram to train its new AI assistant, Meta AI. The company might use your input data to improve Meta AI.

Meta AI is based on a custom Llama 2 model combined with Meta's new Emu image model. During Meta's Connect 2023 conference, Nick Clegg, Meta's president of global affairs, told Reuters that the new AI assistant was trained with content from Facebook and Instagram, in addition to publicly available datasets. The text content went into Llama, and the images went into Emu.

According to Clegg, only publicly available posts were used for training. Private posts shared only with family and friends, as well as private messages, were excluded. Meta also avoided public datasets with "a heavy preponderance of personal information," he said.

Much of the data Meta uses for training is publicly available, Clegg said, but not every public source is included. Data from LinkedIn, for example, was not used for training. If you feed data into Meta AI, Meta may use it to improve Meta AI's capabilities, a spokesperson told Reuters.

AI Copyright: Clegg expects tough trials

Meta's chief lobbyist expects a "fair amount of litigation" in the debate over whether the use of copyrighted data falls under the fair use doctrine.

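The fair use doctrine in U.S. copyright law permits limited use of copyrighted material without the rights holder's permission, for purposes such as criticism, commentary, teaching, and research. Courts weigh, among other factors, whether the use is transformative, meaning it creates something substantially new. AI companies like OpenAI, which has been sued several times, are expected to invoke fair use in upcoming court cases. Clegg expects the courts to agree.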

For Meta AI, Meta has built in safeguards to avoid abuse. These include preventing the generation of realistic photos of famous people and content that violates copyright laws.

In its latest image model, DALL-E 3, OpenAI also blocks prompts that ask for images in the style of well-known living artists by name. In addition, the company is offering artists the option to remove their images from the training data of future models.

In addition to its ChatGPT competitor Meta AI, Meta demonstrated a range of generative AI applications for its social platforms at Connect 2023. These include personalized AI chats based on celebrities, AI-powered image editing, and text-based sticker generation.

Summary
  • Meta AI, Meta's new AI assistant, was trained using content from Facebook and Instagram, in addition to public datasets. Only publicly available posts were used for training, not private posts or messages. Meta can use the data you feed into Meta AI to further improve its capabilities.
  • Nick Clegg, president of global affairs at Meta, expects a lot of litigation over whether the use of copyrighted data falls under the fair use doctrine. The fair use doctrine permits limited use of copyrighted material without permission, particularly when the use is transformative.
  • Meta has built safeguards into Meta AI to prevent abuse, including preventing the creation of realistic photos of well-known people and content that infringes copyright.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.