OpenAI’s language model GPT-3 can’t tell fact from fiction. Automated web searches are meant to help the AI do just that.
OpenAI’s huge language model GPT-3 handles all kinds of text tasks, but it repeatedly produces misinformation. Especially on tasks that require very specific factual knowledge about the world, or whose answers were not part of the training material, GPT-3 regularly “hallucinates” incorrect information, according to OpenAI.
To combat such misinformation, OpenAI is turning to Internet searches: a new variant of GPT-3 has learned to search the Internet for answers.
OpenAI WebGPT searches for answers
The variant is called WebGPT and can issue search queries, follow links, scroll up and down web pages, and cite the sources of the answers it finds. This should make it easier to give the AI system feedback and to increase its accuracy.
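The browsing actions listed above can be pictured as a small text-based interface the model acts through. The following sketch is an illustrative assumption about what such an interface might look like (the class, command names, and page model are invented here, not OpenAI’s actual environment):

```python
# Minimal sketch of a text-based browsing interface with the actions the
# article names: search, follow links, scroll, and cite sources.
# Everything here is an illustrative assumption, not OpenAI's implementation.

class TextBrowser:
    def __init__(self, search_fn, window=500):
        self.search_fn = search_fn  # maps a query to a list of (title, url, text)
        self.results = []           # current search results
        self.page = None            # currently open (title, url, text) tuple
        self.offset = 0             # scroll position as a character offset
        self.window = window        # visible characters per "screen"
        self.quotes = []            # collected snippets with their sources

    def search(self, query):
        """Issue a search query and return the result titles."""
        self.results = self.search_fn(query)
        return [title for title, _, _ in self.results]

    def open_link(self, index):
        """Follow one of the search-result links."""
        self.page = self.results[index]
        self.offset = 0
        return self.visible_text()

    def scroll(self, direction):
        """Scroll the open page up or down by one window."""
        step = self.window if direction == "down" else -self.window
        self.offset = max(0, self.offset + step)
        return self.visible_text()

    def visible_text(self):
        _, _, text = self.page
        return text[self.offset:self.offset + self.window]

    def quote(self, snippet):
        """Record a snippet plus its source URL for citation in the answer."""
        title, url, _ = self.page
        self.quotes.append({"source": url, "title": title, "text": snippet})
```

Keeping the interface purely textual is what lets a language model drive it: every action and every observation is just a string the model can read or emit.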
OpenAI’s WebGPT builds on the company’s other work on safe artificial intelligence: in September 2020, a team demonstrated a summarization AI system that was improved with human feedback. This was followed in September 2021 by an AI system that can summarize entire books and likewise relies on human feedback for optimization.
However, both systems additionally use an algorithm that learns human preferences from the given feedback via reinforcement learning and then further trains the summarization system. This reduces the amount of human feedback required and the associated cost.
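The core of learning preferences from feedback is usually a reward model trained so that answers humans preferred score higher than the ones they rejected. The pairwise logistic loss below is the standard textbook formulation of that idea (a Bradley–Terry-style loss), shown here as a general sketch rather than OpenAI’s exact training code:

```python
import math

def preference_loss(reward_preferred, reward_rejected):
    """-log sigmoid(r_preferred - r_rejected).

    The loss is small when the reward model already scores the
    human-preferred answer higher, and large when it ranks the
    rejected answer above the preferred one.
    """
    margin = reward_preferred - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Minimizing this loss over many human comparisons yields a scalar reward function that can then stand in for a human rater, which is exactly what cuts the feedback cost the article mentions.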
WebGPT learns from humans and machines
WebGPT likewise learns from human examples as well as from an algorithm that has analyzed which kinds of answers people prefer. First, WebGPT learns from demonstrations of a human using a web browser to answer questions. Feedback from the second algorithm then improves the accuracy of the answers.
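Once such a preference-trained scorer exists, one simple way it can improve answer accuracy is best-of-n (rejection) sampling: generate several candidate answers and keep the one the scorer rates highest. The sketch below illustrates that general technique with stand-in generator and scorer functions; whether and how OpenAI applies it here is an assumption:

```python
# Illustrative best-of-n sampling: `generate` and `score` are hypothetical
# stand-ins for a question-answering model and a preference-trained scorer.

def best_of_n(question, generate, score, n=4):
    """Sample n candidate answers and return the highest-scoring one."""
    candidates = [generate(question) for _ in range(n)]
    return max(candidates, key=score)
```

The appeal of this approach is that it needs no further fine-tuning of the answering model itself; the scorer simply filters its outputs.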
In tests with questions from the ELI5 and TruthfulQA datasets, WebGPT performs significantly better than GPT-3, but still falls short of the quality of human answers. The approach is promising, however, OpenAI says, and will now be improved with adversarial training and automated debates between multiple models.
OpenAI warns of manipulation and internet access risks
According to OpenAI, however, better versions of WebGPT also carry risks. For example, automatic source citation conveys an authority that is not always warranted, because the quality of the source is not verified. A better system might also cherry-pick sources that it expects people to find convincing, even if those sources contain errors.
The current WebGPT has only limited internet access and, according to an assessment of GPT-3’s capabilities, is incapable of abusing that access. With better models, however, the risk of giving an AI system full internet access grows, OpenAI writes. The company is therefore already developing internal safety mechanisms.