Databricks has introduced DBRX, a powerful new open language model that outperforms GPT-3.5, Grok-1, Mixtral, and Llama 2. The company aims to drive the trend toward greater transparency and open models in the AI industry.
Technology company Databricks has released DBRX, a new open language model that it claims outperforms established open-source models. In standardized benchmark tests, DBRX beat Meta's Llama 2, Mistral AI's Mixtral, and even the recently released Grok-1 model from Elon Musk's xAI. DBRX also outperformed OpenAI's GPT-3.5 on most benchmarks.
In composite benchmarks such as the Hugging Face Open LLM Leaderboard and the Databricks Model Gauntlet, DBRX achieved the best results of any model tested. It also performs strongly in areas such as programming and math.
According to Databricks, DBRX is even close in quality to GPT-4, currently OpenAI's most powerful closed language model.
However, the model is not fully open source: it ships with a license that sets rules for its use, and the training data is not available. It is better described as an open model, similar to Meta's Llama 2, which open-source watchdogs likewise do not consider open source. Private and commercial use is permitted.
In addition, the team did not test DBRX against other models such as Alibaba's Qwen1.5, which, according to published benchmarks, outperforms the new model at least on MMLU.
DBRX relies on Mixture-of-Experts
DBRX is a mixture-of-experts model with 132 billion parameters, of which only 36 billion are active for any given input, making inference efficient in terms of tokens per second. The model was trained on 3,072 Nvidia H100 GPUs on 12 trillion tokens of text and code, with a maximum context window of 32,000 tokens. According to the announcement, the combination of high data quality and adjustments to the model architecture to improve hardware utilization increased training efficiency by up to 50 percent.
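The announcement does not spell out DBRX's routing internals, but the general mixture-of-experts mechanism behind the 132-billion-total / 36-billion-active split can be sketched in a few lines: a small gating network scores a set of expert feed-forward blocks for each token, and only the top-scoring experts actually run, so most parameters stay inactive on any forward pass. The sizes below (8 experts, top-2 routing, small dimensions) are illustrative PyTorch defaults, not DBRX's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal top-k mixture-of-experts layer (illustrative sizes, not DBRX's)."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)  # router: scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.gate(x)                            # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run, so most parameters stay inactive per token.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k : k + 1] * expert(x[mask])
        return out

layer = MoELayer()
y = layer(torch.randn(10, 512))  # each token used 2 of 8 experts
```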
In addition, Databricks allows customers to use and customize DBRX on the Databricks platform and train their own models on private data. The open-source community has access to DBRX through the Databricks GitHub repository and Hugging Face.
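As a rough illustration of that access path, the snippet below loads DBRX Instruct via the Hugging Face transformers library. The repo id databricks/dbrx-instruct matches Databricks' Hugging Face organization, but treat the details as a sketch: the weights are gated behind license acceptance on the hub, and a 132-billion-parameter model requires several high-memory GPUs to run.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",          # shard across available GPUs
    trust_remote_code=True,     # may be required depending on the transformers version
)

messages = [{"role": "user", "content": "What is a mixture-of-experts model?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```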
By taking an open approach, Databricks aims to foster innovation in generative AI and bring more transparency to the development of AI models. The company emphasizes that open LLMs are gaining importance as companies replace proprietary models with customizable open models to achieve greater efficiency and control. Databricks believes that open models like DBRX can help companies become more competitive in their respective industries.
Databricks provides two variants, DBRX Base and DBRX Instruct. In 2023, the company acquired MosaicML, whose team was an early publisher of powerful open language models with its MPT series.