AI research

Large language models: OpenAI CEO sees 'end of an era' in number of parameters

Matthias Bastian
Image: an intricate, interconnected network of nodes and links, representing the enormous number of parameters in a large language model.

Midjourney prompted by THE DECODER

In recent years, the progress of large language models has been measured primarily by their number of parameters. Sam Altman, CEO of OpenAI, believes this yardstick is no longer useful.

Altman compares the race to increase the parameter counts of large language models to the race to increase chip clock speeds in the 1990s and 2000s, when clock speed was the headline metric. Today, the clock speed of a smartphone chip is almost irrelevant, even though these chips are far more powerful than earlier processors, Altman said.

The OpenAI CEO no longer sees parameter count alone as a good indicator of a model's performance. "I think we're at the end of the era where it's gonna be these giant models, and we'll make them better in other ways," Altman said at the Imagination in Action event (via TechCrunch), where he also commented on GPT-5 and the AI Pause letter.

Focus on capabilities

Still, the number of parameters could continue to grow, Altman said. But the focus needs to be on improving and expanding the capabilities of the models, not on the parameter count itself. Future architectures, for example, could consist of several smaller models working together.
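One common realization of "several smaller models working together" is a mixture-of-experts layout, where a gating function routes each input to a specialized sub-model. The sketch below is purely illustrative (the experts, gate weights, and routing rule are my own toy choices, not anything OpenAI has described):

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of gating scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(gate, x):
    """Top-1 routing: pick the expert with the highest gating probability."""
    scores = [sum(g * xi for g, xi in zip(row, x)) for row in gate]
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)

# Three tiny "experts" -- here just scalar multipliers standing in
# for small specialized models.
experts = [lambda x, w=w: [w * xi for xi in x] for w in (0.5, 1.0, 2.0)]

# Hypothetical gate weights: one row of scores per expert.
gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

x = [0.2, 0.9]
chosen = route(gate, x)   # gating picks expert 2 for this input
y = experts[chosen](x)    # only that expert runs -> [0.4, 1.8]
```

The point of the design is that total capacity can grow with the number of experts while the compute per input stays roughly that of one small model, since only the routed expert is evaluated.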

Altman has said in the past that future AI models should be distinguished by their efficiency and data quality rather than their sheer number of parameters. Models such as DeepMind's Chinchilla, Aleph Alpha's Sparse Luminous Base, and Meta's LLaMA models show that language models with fewer parameters can keep up with larger models through a more efficient architecture or training on more data.
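The Chinchilla result is often summarized as a rough rule of thumb: for a compute-optimal model, train on about 20 tokens per parameter. The snippet below sketches that heuristic (the 20:1 ratio is an approximation from DeepMind's scaling-law work, not an exact law):

```python
def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate number of training tokens for a compute-optimal model,
    using the ~20 tokens-per-parameter rule of thumb."""
    return n_params * tokens_per_param

# Chinchilla itself: 70 billion parameters trained on ~1.4 trillion
# tokens -- exactly this 20:1 ratio -- and it matched much larger models.
tokens = chinchilla_optimal_tokens(70e9)  # -> 1.4e12
```

Under this heuristic, doubling parameter count without doubling training data leaves compute on the table, which is one reason parameter count alone says little about capability.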

In the past, OpenAI always disclosed the number of parameters in its models; GPT-4 is the first for which it has not. Semafor reports that GPT-4 has one trillion parameters, about six times as many as GPT-3. This number has not been confirmed by any other source. When asked, Semafor journalist Reed Albergotti would not comment specifically on the source of the number or its accuracy, pointing instead to possible follow-up reporting.
