
Nvidia and its partners have announced a competition to advance the use of large language models (LLMs) in hardware design.

According to Nvidia, current LLMs like GPT-4 still struggle to generate practical hardware designs without human intervention, primarily because the models see too little hardware-specific code during training.

The competition aims to create a comprehensive, high-quality open-source dataset of Verilog code for training LLMs. The goal is to spark an "ImageNet-like revolution in LLM-based hardware code generation," as stated on the competition website.

Nvidia researcher Jim Fan says Nvidia is "very interested" in automating the design process for its next-generation GPUs. The company believes that by building better GPUs, they can achieve more intelligence per unit of training time. This increased intelligence will then lead to improved coding by LLMs, which will ultimately enable the design of even more advanced GPUs.

"Some day, we can take a vacation and NVIDIA will still keep shipping new chips. Time to kickstart a self-bootstrapping, exponential loop that iterates over both hardware and models," Fan writes.

The competition has two phases. In the first phase, participants collect or generate Verilog code samples to expand the existing MG Verilog dataset, with a focus on scalable collection methods.

In the second phase, participants will receive the complete dataset with all samples submitted in phase one. They will work on improving the dataset's quality through data cleansing and label generation, emphasizing automated methods.
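The automated cleaning step of phase two might look something like the sketch below: deduplicate Verilog samples and drop fragments that are not self-contained modules. The sample format (dicts with a `code` field) and the specific heuristics are illustrative assumptions, not the competition's actual pipeline or schema.

```python
def clean_samples(samples):
    """Deduplicate and filter a list of {"code": str} Verilog samples.

    Illustrative sketch only: the real competition pipeline and data
    schema are not specified in this detail.
    """
    seen = set()
    cleaned = []
    for sample in samples:
        code = sample["code"].strip()
        # Drop fragments that lack a complete module ... endmodule pair.
        if "module" not in code or "endmodule" not in code:
            continue
        # Cheap exact-duplicate check on whitespace-normalized code.
        key = " ".join(code.split())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"code": code})
    return cleaned

samples = [
    {"code": "module inv(input a, output y);\n  assign y = ~a;\nendmodule"},
    # Same module with different whitespace: caught by normalization.
    {"code": "module  inv(input a, output y);\n  assign y = ~a;\nendmodule"},
    # A bare assignment without a module wrapper: filtered out.
    {"code": "assign y = a & b;"},
]
print(len(clean_samples(samples)))  # 1
```

A production pipeline would go further, e.g. syntax-checking each sample with a Verilog parser and detecting near-duplicates rather than only exact ones.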

Contributions will be judged by how much the submitted data improves a fine-tuned CodeLlama 7B-Instruct model. Nvidia will provide contestants with a starter kit that includes a base dataset, sample data, and code to fine-tune the LLM.

Registration for the competition closes at the end of July. The results will be presented at the International Conference on Computer-Aided Design (ICCAD) at the end of October.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • Nvidia and partners have launched a competition to advance the development of hardware using large language models (LLMs), aiming to create a comprehensive open-source dataset of Verilog code for training LLMs.
  • The competition seeks to address the current limitations of LLMs in generating practical hardware designs without human intervention, primarily due to insufficient hardware-specific code during training.
  • The competition has two phases: the first phase focuses on collecting or generating Verilog code examples to expand the existing MG Verilog dataset, while the second phase involves improving the dataset's quality through data cleansing and label generation. Registration closes at the end of July and results will be presented at a conference in late October.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.