AI in practice

Aug 2, 2023

BTLM-3B-8k-base brings LLM capabilities to devices with just 3GB of memory

Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.

Cerebras and Opentensor have trained a 3-billion-parameter language model with an 8k context window, called BTLM-3B-8k-base, on the Condor Galaxy 1 (CG-1) supercomputer. The new model outperforms other models of similar size, achieves performance comparable to open 7B-parameter models, can be quantized to fit on devices with as little as 3 GB of memory, and is licensed for commercial use. Compared with 7B models, it requires 71% fewer training FLOPs and has a 58% smaller memory footprint for inference.
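To see why quantization matters for the 3 GB figure, here is a rough back-of-the-envelope sketch. It assumes a nominal 3 billion parameters and counts only the weights. Activations, the attention cache for the 8k context, and runtime overhead are ignored, so real memory use will be higher. The function name and the bit widths shown are illustrative choices, not details from Cerebras or Opentensor.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: a nominal 3 billion parameters, as stated in the article.
PARAMS_3B = 3e9

# Weights-only estimate at a few common precisions (illustrative bit widths).
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(PARAMS_3B, bits):.1f} GB")
```

Under these assumptions, 16-bit weights alone come to about 6 GB, while 4-bit weights come to about 1.5 GB, which leaves headroom on a 3 GB device for everything else the model needs at inference time.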

Sources

Hugging Face

Matthias Bastian
