
The European Data Protection Board (EDPB) has published a preliminary report on investigations into ChatGPT by national data protection authorities. The authority sees several problematic practices in OpenAI's processing of personal data.


Until February 15, 2024, OpenAI did not have an establishment in the EU, which allowed national supervisory authorities to open investigations independently.

With the establishment of OpenAI Ireland Limited, the company now falls under the so-called one-stop shop mechanism. This means that the Irish DPA is expected to take primary responsibility for overseeing OpenAI in Europe.

The EDPB Task Force has developed a joint list of questions that several authorities have submitted to OpenAI. It covers a range of data protection issues, such as the legal basis for data processing, transparency for data subjects, data security, data retention periods, and data subjects' rights. The EDPB questionnaire can be found on page 9 of the report.


OpenAI is responsible for making sure it's GDPR-compliant, even if people put personal things in their prompts

The authority says that OpenAI collects large amounts of personal data by reading publicly available sources (web scraping). Where this processing is based on legitimate interest, that interest must be weighed against the interests of the data subjects. OpenAI must at least consider technical measures to exclude certain data categories and sources, and to anonymize or delete data before training.

The processing of special categories of personal data, such as data relating to health or sexual orientation, is only allowed under strict conditions. The authority says that the mere fact that information is publicly accessible, for example because someone posted it online, doesn't mean it can be used. It's also important to have filtering measures in place during and after data collection to exclude the relevant categories of data.
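
To make the idea of such technical measures more concrete, here is a minimal sketch of what excluding sources, filtering special categories and redacting identifiers before training could look like. It is purely illustrative: the domain list, the keyword list and all function names are hypothetical, and nothing here describes OpenAI's actual pipeline or a method prescribed by the EDPB. A real system would need far more robust classifiers than simple keyword matching.

```python
import re
from urllib.parse import urlparse

# Hypothetical block list of source domains to exclude entirely.
EXCLUDED_DOMAINS = {"health-forum.example", "dating.example"}

# Hypothetical keyword list standing in for a real special-category classifier.
SPECIAL_CATEGORY_TERMS = {"diagnosis", "sexual orientation"}

# Simple pattern for email addresses (an assumption; not exhaustive).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def keep_source(url: str) -> bool:
    """Return False for documents that come from an excluded domain."""
    host = urlparse(url).hostname or ""
    return host not in EXCLUDED_DOMAINS


def mentions_special_category(text: str) -> bool:
    """Crude substring check for special-category terms."""
    lowered = text.lower()
    return any(term in lowered for term in SPECIAL_CATEGORY_TERMS)


def redact(text: str) -> str:
    """Replace email addresses with a placeholder token."""
    return EMAIL_RE.sub("[EMAIL]", text)


def prepare(records):
    """Filter and redact (url, text) pairs before they reach a training set."""
    cleaned = []
    for url, text in records:
        if not keep_source(url) or mentions_special_category(text):
            continue  # drop the whole record
        cleaned.append(redact(text))
    return cleaned


if __name__ == "__main__":
    sample = [
        ("https://health-forum.example/t/1", "Thread about my diagnosis."),
        ("https://news.example/a", "Contact jane.doe@example.org for details."),
        ("https://blog.example/p", "A post about my sexual orientation."),
    ]
    print(prepare(sample))
```

In this sketch, the first record is dropped because of its source, the third because of its content, and only the second survives, with the email address replaced by a placeholder.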

The report also looks at how user input is used to train language models. OpenAI relies on "legitimate interest" as the legal basis for this. The EDPB believes that users should at least be clearly informed about it, and that transparency is important when balancing interests.

OpenAI has made some improvements in this area since the launch of ChatGPT, but there's still a lot of confusion about when and how data is used for AI training, and the risks involved.

In addition, OpenAI shouldn't put the onus on users to ensure that their prompts are GDPR-compliant. If a publicly accessible chatbot is fed with personal data, the provider is still responsible for ensuring that its service remains GDPR-compliant.


Technical impossibility is not an argument for breaking the law

The EDPB reminds OpenAI that it should make it easy for data subjects to exercise their rights, including the rights of access, erasure and rectification. OpenAI should also improve the way it assists users in exercising these rights.

According to the report, "technical impossibility cannot be invoked to justify non-compliance". This statement could have far-reaching consequences, given that AI models, once trained, are largely static black boxes from which the provider cannot simply delete individual pieces of personal data.

Overall, the EDPB believes that OpenAI and other providers of similar language models still have significant work to do to meet the requirements of the GDPR. Investigations by national supervisory authorities are ongoing, and the considerations in the report should be seen as preliminary assessments. Landmark decisions are still pending.

Summary
  • The European Data Protection Board (EDPB) has released a preliminary report on investigations into ChatGPT's data processing practices by national data protection authorities. It highlights several issues with how OpenAI handles personal data under the GDPR.
  • The EDPB Task Force submitted a joint questionnaire to OpenAI covering topics such as the legal basis for data processing, transparency, data security, storage duration, and data subjects' rights. It emphasizes that OpenAI is responsible for GDPR compliance even when users input personal information.
  • The EDPB thinks that OpenAI and other language model providers need to take serious action to meet GDPR requirements. They should make it easier for data subjects to exercise their rights, and they can't use technical impossibility as an excuse for not complying. This could have serious consequences for AI models that are basically static black boxes once they're trained.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.