The CarperAI research lab plans to release a large, GPT-3-level language model trained with human feedback.
CarperAI, along with partners EleutherAI, Scale AI, Multi, Humanloop and Hugging Face, plans to release a Chinchilla-optimal large language model, meaning one whose size and amount of training data are balanced according to the scaling results from DeepMind's Chinchilla study. The model will be explicitly trained to better follow human instructions and will be released as open source.
"The open-source release is crucial for enabling academics, independent researchers, and startups to conduct science and build upon state-of-the-art models," the team writes.
Instruct-GPT: First open-source AI model trained with human feedback
For training, CarperAI relies on reinforcement learning from human feedback (RLHF), a method that OpenAI, among others, uses for its InstructGPT models, which are derivatives of the larger GPT-3 model tuned with human ratings.
Human raters prefer the output of these models, even though the models themselves are significantly smaller and thus more efficient to run. OpenAI sees human feedback in the AI training process as an important safety component in AI alignment. DeepMind has also used this technique for its latest chatbot.
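As a rough illustration of how human ratings can enter training: RLHF pipelines of the kind described for InstructGPT typically first fit a reward model on pairs of outputs that people have ranked, then use that reward model to steer the language model. The sketch below shows only the pairwise ranking loss commonly used for that first step. It is not CarperAI's code; the function name and the example scores are hypothetical.

```python
import math


def preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Pairwise ranking loss: -log(sigmoid(score_preferred - score_rejected)).

    The scores stand in for a reward model's outputs on two answers to the
    same prompt, where a human rater preferred the first answer.
    """
    margin = score_preferred - score_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores: the loss is small when the reward model agrees with
# the human rater and large when it ranks the rejected answer higher.
agrees = preference_loss(2.0, 0.0)
disagrees = preference_loss(0.0, 2.0)
print(agrees < disagrees)
```

Minimizing this loss over many rated pairs pushes the reward model to score human-preferred answers higher, which is what lets a comparatively small model be tuned toward outputs people rate better.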
CarperAI and its partners likewise see training with human feedback as an essential step toward bringing large language models into everyday use.
"The risks of LLMs have been well documented and range from spreading misinformation to reinforcing social biases. Compared to standard language models, training with RLHF dramatically reduces these risks and, at the same time, increases the model's usefulness," the researchers write.
CarperAI is a lab within the EleutherAI research collective, which has previously published large-scale language models. The most recent, GPT-NeoX-20B, approaches GPT-3 in some benchmarks. The team is looking for volunteers to support the Instruct-GPT project. More information is available on the project website, on the Discord channel and on GitHub.