
Nvidia releases free LLMs that match GPT-4 in some benchmarks

Image: Nvidia

Key Points

  • Nvidia releases Nemotron-4 340B, a free pipeline that generates high-quality synthetic data for training and tuning large language models (LLMs). It can be used for commercial applications.
  • The Nemotron-4 340B family consists of a base model trained on 9 trillion tokens, an instruction model for generating diverse synthetic data, and a reward model for filtering high-quality responses.
  • In benchmarks, the Instruct model typically outperforms other open-source and open-weights models, and in some cases it even outperforms GPT-4. Nvidia makes the models available for commercial use under its Open Model License.

Nvidia has released Nemotron-4 340B, an open-source pipeline for generating synthetic data. The model family is designed to help developers create high-quality datasets for training and fine-tuning large language models (LLMs) for commercial applications.

The Nemotron-4 340B family consists of a base model, an instruction model, and a reward model, which together form a pipeline for generating synthetic data that can be used to train and refine LLMs. The base model was trained on 9 trillion tokens.

Synthetic data mimics the properties of real data and can improve data quality and quantity, which is particularly important when access to large, diverse, and annotated datasets is limited.

According to Nvidia, the Nemotron-4 340B Instruct model generates diverse synthetic data that can improve the performance and robustness of customized LLMs in various application areas such as healthcare, finance, manufacturing, and retail.


The Nemotron-4 340B Reward model can further improve the quality of the AI-generated data by filtering out low-quality responses.

Nemotron-4 340B Instruct first generates domain-specific, synthetic training texts. The second model, Nemotron-4 340B Reward, then evaluates these generated texts and provides feedback to gradually improve them. The interaction between the two models produces higher-quality training data over time. | Image: Nvidia
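In code, this pipeline amounts to a generate-then-filter loop. The following is a minimal sketch of that idea, not Nvidia's actual tooling: it assumes both models are served behind an OpenAI-compatible API (the base URL and model IDs are placeholders to adapt to your deployment), and it leaves the reward scoring as a stub, since how the Reward model's ratings are collapsed into a single score is a deployment-specific choice.

```python
# Minimal sketch of the generate-then-filter loop described above.
# Assumptions (not from the article): both models sit behind an
# OpenAI-compatible endpoint; base URL and model ID are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-endpoint.example/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

def generate_candidates(prompt: str, n: int = 4) -> list[str]:
    """Ask the Instruct model for several diverse candidate answers."""
    out = client.chat.completions.create(
        model="nvidia/nemotron-4-340b-instruct",  # assumed model ID
        messages=[{"role": "user", "content": prompt}],
        n=n,
        temperature=0.9,  # higher temperature encourages diversity
    )
    return [choice.message.content for choice in out.choices]

def reward_score(prompt: str, response: str) -> float:
    """Stub: score one (prompt, response) pair with the Reward model.

    The Reward model rates responses on several quality attributes;
    collapsing them into one scalar is left to the deployment.
    """
    raise NotImplementedError("wire this to your Reward model deployment")

def synthesize(prompts: list[str], threshold: float = 3.5) -> list[dict]:
    """Keep only the candidates the Reward model rates above a threshold."""
    dataset = []
    for prompt in prompts:
        for response in generate_candidates(prompt):
            if reward_score(prompt, response) >= threshold:
                dataset.append({"prompt": prompt, "response": response})
    return dataset
```

The number of candidates per prompt and the score threshold trade throughput against quality: a stricter filter keeps less data, but what survives is better training material.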

98 percent of the training data used to fine-tune the Instruct model is synthetic and was created using Nvidia's pipeline.

In benchmarks such as MT-Bench, MMLU, GSM8K, HumanEval, and IFEval, the Instruct model generally performs better than other open-source models such as Llama-3-70B-Instruct, Mixtral-8x22B-Instruct-v0.1, and Qwen-2-72B-Instruct, and in some tests, it even outperforms GPT-4o.

The three Nemotron models rank among the top open models; however, they have significantly more parameters than their competitors, which may make them less efficient in comparison. | Image: Nvidia

It also performs comparably to or better than OpenAI's GPT-4-1106 in human evaluations of various text tasks such as summarization and brainstorming. Detailed benchmarks are available in the technical report. According to Nvidia, the models run on DGX H100 systems with eight GPUs at FP8 precision.


Nvidia's Nemotron-4 340B Instruct model is on par with GPT-4-1106 in text task benchmarks. | Image: Nvidia

The models are optimized for inference with the open-source Nvidia NeMo framework and the Nvidia TensorRT-LLM library. Nvidia makes them available under its Open Model License, which also allows commercial use. The models and data are available on Hugging Face.
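As a rough pointer for getting started, fetching one of the checkpoints could look like the sketch below. The repository ID is an assumption based on Nvidia's usual naming, so verify it on Nvidia's Hugging Face page, and note that a full 340B checkpoint weighs in at hundreds of gigabytes.

```python
# Sketch: download a Nemotron-4 340B checkpoint from Hugging Face.
# The repo ID is an assumption; verify it on Nvidia's Hugging Face page.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="nvidia/Nemotron-4-340B-Instruct")
print(f"Checkpoint downloaded to {local_dir}")
```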

Framing the Nemotron release as a synthetic data generator looks like a strategic move by Nvidia: instead of positioning Nemotron as a competitor to Llama 3 or GPT-4, the company pitches the model family as a tool to help other developers train more and better models across different domains. More training runs and more models on the market mean more demand for GPUs.

Source: Nvidia | Paper | Hugging Face