Nvidia has released Nemotron-4 340B, an open-source pipeline for generating synthetic data. The model family is designed to help developers create high-quality datasets for training and fine-tuning large language models (LLMs) for commercial applications.

The Nemotron-4 340B family consists of a base model, an instruction model, and a reward model, which together form a pipeline for generating synthetic data that can be used to train and refine LLMs. The base model was trained on 9 trillion tokens.

Synthetic data mimics the properties of real data and can improve data quality and quantity, which is particularly important when access to large, diverse, and annotated datasets is limited.

According to Nvidia, the Nemotron-4 340B Instruct model generates diverse synthetic data that can improve the performance and robustness of customized LLMs in various application areas such as healthcare, finance, manufacturing, and retail.

The Nemotron-4 340B Reward model can further improve the quality of the AI-generated data by filtering out low-quality responses and keeping only the best ones.

Nemotron-4 340B Instruct first generates domain-specific, synthetic training texts. The second model, Nemotron-4 340B Reward, then evaluates these generated texts and provides feedback to gradually improve them. The interaction between the two models produces higher-quality training data over time. | Image: Nvidia
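
To make this generate-and-evaluate loop concrete, here is a minimal sketch in Python. The helper functions, their dummy logic, and the score threshold are placeholders standing in for calls to the Instruct and Reward models; Nvidia's actual pipeline runs through the NeMo framework and looks different in practice.

```python
# Minimal sketch of the generate-and-filter loop described above. The helper
# functions stand in for calls to the Nemotron-4 340B Instruct and Reward
# models; their names, the dummy logic, and the score threshold are purely
# illustrative and are not Nvidia's actual API.

def generate_responses(prompt: str, n: int = 4) -> list[str]:
    # Stand-in for the Instruct model, which would draft n candidate responses.
    return [f"Draft answer {i} to: {prompt}" for i in range(n)]

def score_response(prompt: str, response: str) -> float:
    # Stand-in for the Reward model, which would rate helpfulness and quality.
    return min(len(response) / 100.0, 1.0)

def build_synthetic_dataset(prompts: list[str], threshold: float = 0.5) -> list[dict]:
    """Keep only the best-scoring candidate per prompt, if it clears the threshold."""
    dataset = []
    for prompt in prompts:
        candidates = generate_responses(prompt)
        best_score, best_response = max(
            (score_response(prompt, r), r) for r in candidates
        )
        if best_score >= threshold:
            dataset.append({"prompt": prompt, "response": best_response})
    return dataset

if __name__ == "__main__":
    print(build_synthetic_dataset(["Explain synthetic data in one sentence."]))
```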

98 percent of the training data used to fine-tune the Instruct model is synthetic and was created using Nvidia's pipeline.

In benchmarks such as MT-Bench, MMLU, GSM8K, HumanEval, and IFEval, the Instruct model generally performs better than other open-source models such as Llama-3-70B-Instruct, Mixtral-8x22B-Instruct-v0.1, and Qwen-2-72B-Instruct, and in some tests, it even outperforms GPT-4o.

The three Nemotron models are among the top open models; however, they have significantly more parameters than comparable models, which might make them less efficient in comparison. | Image: Nvidia

It also performs comparably to or better than OpenAI's GPT-4-1106 in human evaluations of various text tasks such as summarization and brainstorming. Detailed benchmarks are available in the technical report. According to Nvidia, the models run on DGX H100 systems with eight GPUs at FP8 precision.

Nvidia's Nemotron-4 340B Instruct model is on par with GPT-4-1106 in text task benchmarks. | Image: Nvidia

The models are optimized for inference with the open-source framework Nvidia NeMo and the Nvidia TensorRT-LLM library. Nvidia makes them available under its Open Model License, which also allows commercial use. The models are available on Hugging Face.
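
For anyone who wants to experiment with the checkpoints, the snippet below is a minimal sketch of pulling them from Hugging Face with the huggingface_hub client; the repo id and local path are assumptions, and actual inference would then go through NeMo or TensorRT-LLM, which is not shown.

```python
# Minimal sketch: download the Nemotron-4 340B Instruct checkpoint from
# Hugging Face. The repo id and target directory are assumptions; running
# inference afterwards requires Nvidia NeMo / TensorRT-LLM and is not shown.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="nvidia/Nemotron-4-340B-Instruct",          # assumed repo id
    local_dir="checkpoints/nemotron-4-340b-instruct",   # assumed target path
)
print(f"Checkpoint files downloaded to {local_dir}")
```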

Releasing Nemotron framed as a synthetic data generator seems like a very strategic move by Nvidia: instead of positioning Nemotron as a competitor to Llama 3 or GPT-4, the model family is meant to help other developers train better models, or more of them, across different domains. More training and more models on the market mean more demand for GPUs.

Summary
  • Nvidia releases Nemotron-4 340B, a free pipeline that generates high-quality synthetic data for training and tuning large language models (LLMs). It can be used for commercial applications.
  • The Nemotron-4 340B family consists of a base model trained on 9 trillion tokens, an instruction model for generating diverse synthetic data, and a reward model for filtering high-quality responses.
  • In benchmarks, the instruction model typically outperforms other open-source and open-weight models, and in some cases outperforms GPT-4. Nvidia also makes the models available for commercial use under an open model license.