AI research

Meta's Humpback: pushing the boundaries of open source LLMs through self-alignment

Maximilian Schreiner


Meta developed a method for large language models to iteratively improve their ability to follow instructions, without relying on human annotation or distillation from more powerful models.

Meta's research proposes a new technique called "instruction backtranslation" that allows large language models like LLaMA to be fine-tuned to follow instructions without relying on expensive human annotations or distillation from more powerful models like GPT-4.

Instruction backtranslation is the self-play of instruction tuning

Instruction backtranslation is a two-step process combining self-augmentation and self-curation. In the self-augmentation phase, the language model is used to generate candidate instruction-response pairs from the unlabeled text corpus. For each unlabeled text, the model tries to predict what instruction would elicit that response. This results in a large set of synthesized examples.
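The self-augmentation step can be sketched as follows. This is a minimal illustration, not Meta's implementation: `toy_backward_model` is a hypothetical stand-in for the backward model (in the paper, a LLaMA model fine-tuned to predict instructions from responses).

```python
# Sketch of self-augmentation: for each unlabeled text, the backward
# model predicts what instruction would have elicited it, producing a
# candidate (instruction, response) pair.

def self_augment(predict_instruction, unlabeled_texts):
    """Turn unlabeled texts into candidate (instruction, response) pairs."""
    return [(predict_instruction(text), text) for text in unlabeled_texts]

# Toy stand-in for the backward model (hypothetical, for illustration only).
def toy_backward_model(text):
    return f"Write a short passage about: {text.split()[0]}"

corpus = ["Python is a programming language.", "Whales are marine mammals."]
candidates = self_augment(toy_backward_model, corpus)
print(candidates[0][1])  # the original text becomes the response
```

The key point is the direction of prediction: the unlabeled text is treated as the *response*, and the model works backwards to synthesize the instruction.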

The self-curation phase then uses the model to score these candidate pairs and filter out low-quality ones. The model ranks the examples and keeps only the highest-scoring subset. These steps of generating candidates and curating the best data are repeated. Each iteration produces a better model that can in turn improve the quality of the data it selects for the next round.
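The curation step amounts to scoring and thresholding. In this hedged sketch, `toy_scorer` is a hypothetical placeholder for the model's quality rating (the paper prompts the model itself to rate each pair on a 5-point scale and keeps the top-rated examples):

```python
# Sketch of self-curation: the model scores each candidate pair and
# only pairs at or above a quality threshold are kept for training.

def self_curate(score_pair, candidates, threshold=4):
    """Keep only candidate pairs rated at or above `threshold`."""
    return [pair for pair in candidates if score_pair(*pair) >= threshold]

# Toy scorer (hypothetical): prefer pairs with a substantive response.
def toy_scorer(instruction, response):
    return 5 if len(response.split()) >= 4 else 2

candidates = [
    ("Explain photosynthesis.", "Plants convert sunlight into chemical energy."),
    ("Say hi.", "Hi."),
]
curated = self_curate(toy_scorer, candidates)
print(len(curated))  # only the first pair survives
```

Because the scorer is the model being trained, the filter gets stricter and more accurate as the model improves across rounds.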

Through this iterative self-training process, the model learns to generate better instructions and also becomes better at discriminating high-quality demonstration examples.
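The full iteration can be sketched as a loop: augment, curate, fine-tune on the curated data, repeat. `ToyModel` and its `fine_tune` method are hypothetical placeholders standing in for an actual LLaMA fine-tuning run; in the paper each round yields a successively better model trained on successively better-curated data.

```python
# Sketch of the iterative self-training loop of instruction backtranslation.

def instruction_backtranslation(model, unlabeled_texts, rounds=2):
    for _ in range(rounds):
        # Self-augmentation: synthesize candidate (instruction, response) pairs.
        candidates = [(model.predict_instruction(t), t) for t in unlabeled_texts]
        # Self-curation: keep only pairs the current model rates highly.
        curated = [p for p in candidates if model.score(*p) >= 4]
        # Fine-tune on the curated data to get the next-round model.
        model = model.fine_tune(curated)
    return model

class ToyModel:
    """Toy stand-in (hypothetical) that just counts fine-tuning rounds."""
    def __init__(self, rounds_trained=0):
        self.rounds_trained = rounds_trained
    def predict_instruction(self, text):
        return f"Write about: {text}"
    def score(self, instruction, response):
        return 5  # the toy accepts everything
    def fine_tune(self, curated_pairs):
        return ToyModel(self.rounds_trained + 1)

final = instruction_backtranslation(ToyModel(), ["some unlabeled text"], rounds=2)
print(final.rounds_trained)  # 2
```

The structural point the sketch captures is that the same model plays every role: generator of candidate instructions, judge of their quality, and the object being improved.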

Meta's Humpback model beats Anthropic's Claude in instruction-following benchmarks

Meta's researchers show that this approach leads to strong instruction-following performance, outperforming previous work using the same-scale LLaMA model. The resulting model, Humpback 65B, achieves state-of-the-art results among non-distilled LLaMA methods on the Alpaca instruction-following benchmark, surpassing the performance of models such as Anthropic's Claude, Guanaco, LIMA, and Falcon-Instruct.

In future work, the team plans to further scale this method "by considering larger unlabeled corpora, which our analysis suggests should yield further gains."
