Cerebras and Opentensor have trained BTLM-3B-8k-base, a powerful 3-billion-parameter language model with an 8k context window, on the Condor Galaxy 1 (CG-1) supercomputer. The new model outperforms similarly sized models, achieves performance comparable to open 7B-parameter models, can be quantized to fit on devices with as little as 3 GB of memory, and is licensed for commercial use. It requires 71% fewer training FLOPs and has a 58% smaller memory footprint for inference than comparable 7B models.
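For readers who want to try the model on modest hardware, here is a minimal sketch of loading it with 4-bit quantization via Hugging Face transformers and bitsandbytes. The repository id "cerebras/btlm-3b-8k-base", the use of trust_remote_code, and the quantization settings are assumptions based on the public release, not an official recipe.

```python
# Sketch: load BTLM-3B-8k-base with 4-bit quantization so it fits in a few GB of memory.
# Repo id and loading options are assumptions; check the model card for the official instructions.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "cerebras/btlm-3b-8k-base"  # assumed Hugging Face repo id

# 4-bit weight quantization with fp16 compute (bitsandbytes)
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
    trust_remote_code=True,  # assumed: the model may ship custom architecture code
)

prompt = "The Condor Galaxy 1 supercomputer"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```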
