Nvidia has released Nemotron-4 340B, an open-source pipeline for generating synthetic data. The model family is designed to help developers create high-quality datasets for training and fine-tuning large language models (LLMs) for commercial applications.
The Nemotron-4 340B family consists of a base model, an instruct model, and a reward model, which together form a pipeline for generating synthetic data that can be used to train and refine LLMs. The base model was trained on 9 trillion tokens.
Synthetic data mimics the properties of real data and can improve data quality and quantity, which is particularly important when access to large, diverse, and annotated datasets is limited.
According to Nvidia, the Nemotron-4 340B Instruct model generates diverse synthetic data that can improve the performance and robustness of customized LLMs in various application areas such as healthcare, finance, manufacturing, and retail.
The Nemotron-4 340B Reward model can further improve the quality of the AI-generated data by scoring candidate responses and filtering out low-quality ones, as sketched below.
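As a rough illustration of how the Instruct and Reward models could be combined into a generate-then-filter loop, here is a minimal Python sketch. The model identifiers match the published Hugging Face releases, but `generate`, `score`, and the score threshold are hypothetical placeholders standing in for whatever inference backend and filtering criteria you actually use:

```python
from typing import List, Tuple

INSTRUCT = "nvidia/Nemotron-4-340B-Instruct"  # generates candidate responses
REWARD = "nvidia/Nemotron-4-340B-Reward"      # scores response quality

def generate(model: str, prompt: str) -> str:
    """Placeholder: call your inference backend (NeMo, TensorRT-LLM, ...)."""
    raise NotImplementedError

def score(model: str, prompt: str, response: str) -> float:
    """Placeholder: return the reward model's quality score for a response."""
    raise NotImplementedError

def synthesize(prompts: List[str], n_candidates: int = 4,
               threshold: float = 3.5) -> List[Tuple[str, str]]:
    """For each prompt, sample several candidates and keep only those the
    reward model rates at or above `threshold` (the value is illustrative)."""
    dataset = []
    for prompt in prompts:
        candidates = [generate(INSTRUCT, prompt) for _ in range(n_candidates)]
        dataset += [(prompt, c) for c in candidates
                    if score(REWARD, prompt, c) >= threshold]
    return dataset
```

The design point is simply that generation and quality control are separate models: the instruct model supplies volume and diversity, while the reward model acts as a gate that decides what makes it into the final dataset.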
98 percent of the training data used to fine-tune the Instruct model was synthetic, created using this very pipeline.
In benchmarks such as MT-Bench, MMLU, GSM8K, HumanEval, and IFEval, the Instruct model generally performs better than other open-source models such as Llama-3-70B-Instruct, Mixtral-8x22B-Instruct-v0.1, and Qwen2-72B-Instruct, and in some tests it even outperforms GPT-4o.
It also performs comparably to or better than OpenAI's GPT-4-1106 in human evaluations of various text tasks such as summarization and brainstorming. Detailed benchmarks are available in the technical report. According to Nvidia, the models run on DGX H100 systems with eight GPUs at FP8 precision.
The models are optimized for inference with the open-source framework Nvidia NeMo and the Nvidia TensorRT-LLM library. Nvidia makes them available under its Open Model License, which also allows commercial use. The models and data are available on Hugging Face.
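For a quick test without a local deployment, the Instruct model can also be queried through an OpenAI-compatible API. The sketch below assumes Nvidia's hosted endpoint; treat the base URL, model identifier, and key handling as assumptions to verify against Nvidia's documentation:

```python
from openai import OpenAI

# Sketch only: assumes the Instruct model is served behind an
# OpenAI-compatible endpoint (e.g. Nvidia's hosted API catalog).
# The base URL and model ID are assumptions to verify.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="YOUR_NVIDIA_API_KEY",
)

completion = client.chat.completions.create(
    model="nvidia/nemotron-4-340b-instruct",
    messages=[{"role": "user",
               "content": "Generate five diverse customer-support questions "
                          "about a fictional smart thermostat."}],
    temperature=0.7,
    max_tokens=512,
)
print(completion.choices[0].message.content)
```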
Releasing Nemotron framed as a synthetic data generator looks like a very strategic move by Nvidia: instead of positioning Nemotron as a competitor to Llama 3 or GPT-4, the model family is supposed to help other developers train better models, or more of them, across different domains. More training and more models on the market mean more demand for GPUs.