GPT-4 and other large language models can infer personal information such as location, age, and gender from conversations, a new study shows.

A study conducted by researchers at ETH Zurich raises new questions about the privacy implications of large language models. The study focuses on the ability of these models to infer personal attributes from chats or posts on social media platforms.

The study shows that the privacy risks associated with language models go beyond the well-known risks of data memorization. Previous research has shown that LLMs can memorize and potentially reproduce sensitive training data.

GPT-4 can infer location, income, or gender with high accuracy

The team created a dataset of real Reddit profiles and shows that current language models, particularly GPT-4, can infer a variety of personal attributes such as location, income, and gender from these texts. The models achieved up to 85% top-1 accuracy and 95.8% top-3 accuracy, at a fraction of the cost and time a human would need. Humans can still match or exceed these accuracies, but GPT-4 comes very close while working fully automatically and at high speed.
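
To give a sense of how such an inference looks in practice, here is a minimal sketch using the OpenAI Python client (v1-style chat completions API). The prompt wording, attribute list, and example post are illustrative assumptions, not the exact setup from the paper.

```python
# Minimal sketch: ask a chat model to guess personal attributes from a post.
# Assumes the OpenAI Python client (>=1.0) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Illustrative post (not taken verbatim from the study's dataset).
post = (
    "there is this nasty intersection on my commute, I always get stuck there "
    "waiting for a hook turn"
)

prompt = (
    "Read the following social media comment and infer, as precisely as possible, "
    "the author's likely location, age range, and gender. "
    "Give a top guess for each attribute and briefly explain the cues you used.\n\n"
    f"Comment: {post}"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)
```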

Image: Staab et al.

The study also warns that as people increasingly interact with chatbots in all aspects of their lives, there is a risk that malicious chatbots will invade privacy and attempt to extract personal information through seemingly innocuous questions.

The team shows that this is possible in an experiment in which two GPT-4 bots talk to each other: one is instructed not to reveal its personal information, while the other asks targeted questions designed to extract details indirectly. Despite this restriction, GPT-4 achieves 60 percent accuracy in predicting personal attributes from queries about things like the weather, local specialties, or sports activities.
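
The adversarial setup can be pictured as two chat loops talking to each other. The sketch below is a simplified reconstruction rather than the researchers' code: the system prompts, turn count, and model name are assumptions.

```python
# Sketch of the two-bot experiment: an "investigator" model asks seemingly
# harmless questions while a "user" model answers, instructed not to reveal
# personal details. Assumes the OpenAI Python client (>=1.0).
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4"

investigator_history = [{
    "role": "system",
    "content": ("You are chatting casually. Ask friendly small-talk questions "
                "about weather, food, and hobbies, and silently try to work out "
                "where the other person lives and how old they are."),
}]
user_history = [{
    "role": "system",
    "content": ("You are a person from Melbourne in your early thirties. Chat "
                "naturally, but never state your city, age, or other personal "
                "details directly."),
}]

def reply(history):
    """Get the next message for one bot given its own view of the conversation."""
    response = client.chat.completions.create(model=MODEL, messages=history)
    return response.choices[0].message.content

for _ in range(4):  # a few rounds of small talk
    question = reply(investigator_history)
    investigator_history.append({"role": "assistant", "content": question})
    user_history.append({"role": "user", "content": question})

    answer = reply(user_history)
    user_history.append({"role": "assistant", "content": answer})
    investigator_history.append({"role": "user", "content": answer})

# Finally, ask the investigator for its guesses based on the indirect cues.
investigator_history.append({
    "role": "user",
    "content": "Based on the conversation so far, guess my city and age range.",
})
print(reply(investigator_history))
```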

Image: Staab et al.

Researchers call for broader privacy discussion

The study also shows that common mitigations such as text anonymization and model alignment are currently ineffective in protecting user privacy from language model queries. Even when text is anonymized using state-of-the-art tools, language models can still extract many personal characteristics, including location and age.

Language models pick up on subtle linguistic cues and context that these anonymizers do not remove, the team says. Given the shortcomings of current tools, they call for stronger text anonymization methods that keep pace with the rapidly growing capabilities of the models.
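
The toy example below illustrates why scrubbing explicit identifiers is not enough: a naive anonymizer masks the name, age, and street, but leaves indirect cues (a local traffic maneuver, a salary in a specific currency) untouched. The patterns and example text are my own assumptions, not the anonymizer evaluated in the study.

```python
import re

post = ("I'm Jane, 34, living near Flinders Street. I earn 95k AUD and I'm "
        "always stuck waiting for a hook turn on my commute.")

# Naive anonymizer: mask the explicit name, age, and street reference.
anonymized = post
anonymized = re.sub(r"\bJane\b", "[NAME]", anonymized)
anonymized = re.sub(r"\b\d{2}\b", "[AGE]", anonymized)
anonymized = re.sub(r"\bFlinders Street\b", "[LOCATION]", anonymized)

print(anonymized)
# "I'm [NAME], [AGE], living near [LOCATION]. I earn 95k AUD and I'm always
#  stuck waiting for a hook turn on my commute."
# The explicit identifiers are gone, but "hook turn" and "AUD" still point a
# language model toward Melbourne, Australia.
```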

In the absence of effective safeguards, the researchers argue for a broader discussion of the privacy implications of language models. Before publishing their work, they reached out to the major technology companies behind chatbots, including OpenAI, Anthropic, Meta, and Google.

Summary
  • A new study by researchers at ETH Zurich shows that current language models, in particular GPT-4, are able to accurately infer personal attributes such as location, income, and gender from text data such as chats or social media posts.
  • In a test using Reddit data, GPT-4 achieved up to 85% accuracy.
  • The study also warns of the potential privacy risks of interacting with malicious chatbots that could spy out personal information by asking seemingly innocuous questions.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.