A new method can correct and update large AI models

Large AI networks like language models make mistakes or contain outdated information. MEND shows how to update LLMs without changing the whole network.

Large AI models have become standard in many AI applications, such as natural language processing, image analysis, and image generation. The models, such as OpenAI's GPT-3, often have more diverse capabilities than small, specialized models and can be further improved via finetuning.

However, even the largest AI models regularly make mistakes and additionally contain outdated information. GPT-3's most recent data is from 2019 - when Theresa May was still prime minister.

Large AI models must become updatable and correctable

To re-train GPT-3 would be an enormous effort and is not a solution - neither economically nor in principle, since the model would be outdated again within a few weeks. In addition, many flaws only become visible during use and not during the curation of the training data.

Ideally, therefore, large AI models should be able to be corrected by developers and/or users in such a way that the model remains intact except for the changes.

However, such targeted corrections are extremely difficult. Researchers at Stanford University have now published a paper that addresses this problem. They say the decentralized black-box nature of the representations learned by large neural networks is the biggest obstacle. Previously known solutions for smaller neural networks do not translate to large models, the team says.

If presented with only a single problematic input and new desired output, fine-tuning approaches tend to overfit; other editing algorithms are either computationally infeasible or simply ineffective when applied to very large models.

From the paper

MEND relies on small auxiliary networks

A successful modification would have to meet three conditions: reliability, locality, and generality.

It must reliably change the output for a previously problematic input ("Who is the prime minister of the UK?"),
while minimally affecting the model's output for unrelated inputs ("What sports team does Messi play for?"),
and at the same time producing correct output for related inputs ("Who is the UK PM?").

As a solution, the team proposes "Model Editor Networks with Gradient Decomposition" (MEND). Instead of fine-tuning a large model directly, MEND trains small model editor networks (multilayer perceptrons) that make changes to the weights of the large model. To do this, they use a low-dimensional representation of the fine-tuning gradient of each correction.

Video: Mitchell et al.

Recommendation

AI research

Study reveals AI models have hidden capabilities they can't access through normal prompts

In their paper, the researchers show that MEND can be trained for large models with more than ten billion parameters within one day on a single GPU. It can then perform quick edits on T5, GPT, BERT, or BART models. The team also shows that MEND is likely to scale to model sizes of hundreds of billions of parameters.

For more information, see the MEND project page. The code is available on GitHub.

Join our community

Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

A new method can correct and update large AI models

Large AI models must become updatable and correctable

MEND relies on small auxiliary networks

Study reveals AI models have hidden capabilities they can't access through normal prompts

Google DeepMind open-sources AI text watermarking for Gemini

Microsoft's RUBICON tells if your AI coding buddy is actually helping or just slacking off

Language models like GPT-4 memorize more than they reason, study finds

Cloudflare CEO Matthew Prince sees trouble ahead for the open web

New Othello experiment supports the world model hypothesis for large language models

ChatGPT might be draining your brain, MIT warns - what ‘cognitive debt’ means for you

A new method can correct and update large AI models

Large AI models must become updatable and correctable

MEND relies on small auxiliary networks

Share

Bank details