GPT-4 and other large language models can infer personal information such as location, age, and gender from conversations, a new study shows.
A study conducted by researchers at ETH Zurich raises new questions about the privacy implications of large language models. The study focuses on the ability of these models to infer personal attributes from chats or posts on social media platforms.
The study shows that the privacy risks associated with language models go beyond the well-known risk of data memorization: previous research has shown that LLMs can memorize and potentially reproduce sensitive training data, but this work focuses on what models can infer about users from ordinary text.
GPT-4 can infer location, income, or gender with high accuracy
The team compiled a dataset of real Reddit profiles and shows that current language models, particularly GPT-4, can infer a variety of personal attributes such as location, income, and gender from these texts. The models reached up to 85 percent top-1 accuracy and 95.8 percent top-3 accuracy, at a fraction of the cost and time required by human labelers. Humans can still match or exceed these numbers, but GPT-4 comes close while working automatically and at high speed.
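To make the setup concrete, here is a minimal sketch of how such attribute inference could be prompted through a chat API. It uses the OpenAI Python SDK with a made-up comment; the prompt wording and attribute list are illustrative assumptions, not the researchers' actual pipeline.

```python
# Sketch: asking a chat model to guess personal attributes from free text.
# Assumes the OpenAI Python SDK (`pip install openai`) and an OPENAI_API_KEY
# in the environment; the comment and prompt are invented for illustration.
from openai import OpenAI

client = OpenAI()

comment = (
    "There's this nice intersection near my place where the trams cross right "
    "in front of the old brewery. I bike past it on my way to the lake every morning."
)

prompt = (
    "Read the following social media comment and guess the author's likely "
    "city, age range, and gender. Give your top guess for each attribute and "
    "a short justification.\n\n"
    f"Comment: {comment}"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)
```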
The study also warns that as people increasingly interact with chatbots in all aspects of their lives, there is a risk that malicious chatbots will invade privacy and attempt to extract personal information through seemingly innocuous questions.
The team shows that this is possible in an experiment in which two GPT-4 bots talk to each other: one is instructed not to reveal personal information, while the other crafts targeted, seemingly harmless questions designed to elicit indirect clues. Despite this restriction, GPT-4 still reaches about 60 percent accuracy in predicting personal attributes from answers about things like the weather, local specialties, or sports activities.
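The two-bot experiment can be pictured as two ordinary chat sessions with different system prompts: one holds back personal details, the other steers the small talk and then guesses. The sketch below is a rough approximation under that assumption, again using the OpenAI Python SDK; the prompts, turn count, and final guessing question are illustrative, not the study's exact setup.

```python
# Sketch of an adversarial two-bot chat: a "user" bot that must not reveal
# personal details, and an "adversary" bot that makes small talk and then
# guesses the user's location. Prompts and turn count are invented examples.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4"

user_bot = [{"role": "system", "content": (
    "You are chatting casually. Never state your city, age, or gender directly, "
    "but otherwise answer naturally. (For this demo, you live in Zurich.)"
)}]
adversary_bot = [{"role": "system", "content": (
    "You are making friendly small talk about weather, food, and sports. "
    "Later you will be asked to guess where the other person lives."
)}]

def reply(history):
    """Generate the next message for a bot given its own view of the chat."""
    response = client.chat.completions.create(model=MODEL, messages=history)
    return response.choices[0].message.content

# The adversary opens with an innocuous question.
message = "Nice day, isn't it? What's the weather like where you are?"
adversary_bot.append({"role": "assistant", "content": message})

for _ in range(3):  # a few question/answer turns
    user_bot.append({"role": "user", "content": message})
    answer = reply(user_bot)
    user_bot.append({"role": "assistant", "content": answer})

    adversary_bot.append({"role": "user", "content": answer})
    message = reply(adversary_bot)
    adversary_bot.append({"role": "assistant", "content": message})

# Finally, ask the adversary to infer location from the indirect clues it gathered.
adversary_bot.append({"role": "user", "content": (
    "Based on this conversation, where does the other person most likely live?"
)})
print(reply(adversary_bot))
```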
Researchers call for broader privacy discussion
The study also shows that common mitigations such as text anonymization and model alignment are currently ineffective at protecting user privacy against this kind of inference. Even when text is anonymized with state-of-the-art tools, language models can still extract many personal characteristics, including location and age.
Language models pick up on subtler linguistic cues and context that these anonymizers do not remove, the team says. Given these shortcomings, they call for stronger text anonymization methods that keep pace with the rapidly growing capabilities of the models.
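A toy example illustrates the gap: entity-level anonymization can mask an explicit place name, yet a regional turn of phrase survives and may be enough for a capable model to localize the author. The regex below merely stands in for a real anonymization tool and is not the anonymizer evaluated in the study.

```python
# Toy illustration: the explicit place name is masked, but "hook turn"
# (a Melbourne-specific traffic manoeuvre) remains as a subtle cue that
# entity-based anonymization does not touch.
import re

text = "Just moved from Melbourne; still miss doing a hook turn on my way to work."

# Naive anonymization: replace the capitalized place name, standing in for NER-based tools.
anonymized = re.sub(r"\bMelbourne\b", "[LOCATION]", text)
print(anonymized)
# -> "Just moved from [LOCATION]; still miss doing a hook turn on my way to work."
# The phrase "hook turn" survives, and a capable model can still infer the city from it.
```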
In the absence of effective safeguards, the researchers argue for a broader discussion of the privacy implications of language models. Before publishing their work, they reached out to the major technology companies behind chatbots, including OpenAI, Anthropic, Meta, and Google.