
EleutherAI is one of the pioneers of open-source research in generative AI, especially in language models. It's now becoming a not-for-profit organization with full-time researchers.

The EleutherAI research collective is professionalizing. Over the past two and a half years, it has evolved from a group of programmers on Discord into what it calls an open science community. Now, according to its blog, EleutherAI is becoming a non-profit research institute. Twenty scientists can now work full-time for EleutherAI.

EleutherAI members have authored 28 papers, trained dozens of models, and released ten codebases in the past 18 months, including

  • the open-source LLM GPT-NeoX-20B
  • the VQGAN-CLIP image model
  • the 825 GB text training dataset "The Pile"

A complete list of the scientific papers, with links to each paper and a list of all participants, is available online. EleutherAI was also involved in the development of Stable Diffusion.

Leadership positions are held by Stella Rose Biderman as Executive Director and Head of Research, Curtis Huebner as Head of Alignment, and Shiv Purohit as Head of Engineering.

The organization was previously led by Connor Leahy, who will now focus on his AI alignment projects for AGI. Several other former members are also focusing on their own projects.

AI breakthroughs don't happen on the side

Funders include AI company Stability AI, AI model hub Hugging Face, GPU cloud operators CoreWeave and Lambda, former GitHub CEO Nat Friedman, and design platform Canva.

"It has become abundantly clear that the biggest blocker in what we could be accomplishing is the fact that working a forty-hour workweek and doing cutting-edge AI research on the side is unsustainable for most contributors."

EleutherAI

The world has changed a lot since the collective was founded, EleutherAI said. At the time, the world's largest open-source GPT-3-style language model (presumably the full-size GPT-2) had 1.5 billion parameters, whereas today's largest models have hundreds of billions. GPT-3 itself was only available to selected researchers.

In addition, most NLP researchers had a very limited understanding of the techniques required to train such models, as well as of their capabilities and limitations. "We started as a ragtag group nobody had heard of, and within a year had released the largest OSS GPT-3-style model in the world," EleutherAI writes.

New focus on AI interpretability, alignment, ethics, and evaluation

Instead of developing new models, the researchers now plan to focus on areas of AI research for which they originally would have had to train their own models:

"As access to LLMs has increased, our research has shifted to focus more on interpretability, alignment, ethics, and evaluation of AIs. We look forward to continuing to grow and adapt to the needs of researchers and the public."

EleutherAI

In contrast to commercial companies like Google, Microsoft, and OpenAI, which publish their work only in part, non-profit organizations like EleutherAI represent a counter-movement in the AI landscape. LAION and OpenBioML are pursuing similar efforts for open AI science.

Summary
  • EleutherAI will have at least twenty full-time employees working on AI topics such as interpretability, alignment, and ethics.
  • The research collective came together over Discord about two and a half years ago.
  • Among other things, it is responsible for GPT-NeoX-20B, one of the most promising open-source alternatives to GPT-3.
Jonathan works as a freelance tech journalist for THE DECODER, focusing on AI tools and how GenAI can be used in everyday work.