AI in practice

GPT-4 has more than a trillion parameters - Report

Matthias Bastian

Midjourney prompted by GPT-4

Update
  • Further details on GPT-4's size and architecture have been leaked.
  • The system is said to consist of eight models with 220 billion parameters each, for a total of about 1.76 trillion parameters, combined in a Mixture of Experts (MoE) architecture.
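
The leaked figures can be checked with simple arithmetic, and the basic idea of MoE routing fits in a few lines. The following is a minimal sketch, not OpenAI's implementation: the function names, the gate scores and the choice of k = 2 are hypothetical and only illustrate how a gate might pick a few experts per input.

```python
import math


def total_parameters(num_experts: int, params_per_expert: float) -> float:
    """Total parameter count if every expert has the same size."""
    return num_experts * params_per_expert


def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    The weights are renormalized so that the selected experts sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


# Figures from the leak: eight experts with 220 billion parameters each.
total = total_parameters(8, 220e9)
print(f"{total / 1e12:.2f} trillion parameters")  # 1.76 trillion parameters

# Hypothetical gate scores for one input; only two experts are selected.
print(route_top_k([0.1, 2.0, 0.3, 1.5], k=2))
```

In a setup like this, only the selected experts process a given input, so the number of parameters active per input can be much smaller than the total.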

GPT-4 is about six times larger than GPT-3, according to a media report, which also says that Elon Musk's exit from OpenAI cleared the way for Microsoft.

The US website Semafor, citing eight anonymous sources familiar with the matter, reports that OpenAI's new GPT-4 language model has one trillion parameters. Its predecessor, GPT-3, has 175 billion parameters.

Semafor previously reported Microsoft's $10 billion investment in OpenAI in January and the integration of GPT-4 into Bing in February, in both cases before the official announcement.

The number of parameters in a language model is an indicator of its ability to recognize complex patterns and relationships in data and to develop new emergent capabilities. Parameters are the weights and biases that determine how the model maps input to output. They are learned during training.
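
To make "parameter count" concrete, here is a minimal sketch that counts the weights and biases of a small fully connected network. The layer sizes are made up for illustration and have nothing to do with GPT-4, whose architecture is far more complex.

```python
def dense_layer_params(n_in: int, n_out: int, bias: bool = True) -> int:
    """One weight per input-output connection, plus one bias per output."""
    return n_in * n_out + (n_out if bias else 0)


def mlp_params(layer_sizes) -> int:
    """Sum the parameters of all consecutive layer pairs."""
    return sum(
        dense_layer_params(n_in, n_out)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical toy network: 784 inputs, 128 hidden units, 10 outputs.
print(mlp_params([784, 128, 10]))  # 101770
```

Even this toy network has about 100,000 parameters. The reported GPT-4 figure is roughly ten million times larger.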

In addition to model size, data quality and the amount of training data are critical to AI performance. For example, DeepMind's Chinchilla has shown that an AI system with only 70 billion parameters, but trained on more data, can outperform much larger systems such as GPT-3.
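
A rough calculation illustrates the trade-off. The sketch below uses figures that are not from this article: the published training set sizes of about 1.4 trillion tokens for Chinchilla and about 300 billion tokens for GPT-3, and the common rule of thumb that training compute is roughly 6 x parameters x tokens. Treat the results as order-of-magnitude estimates only.

```python
def tokens_per_parameter(params: float, tokens: float) -> float:
    """How many training tokens the model saw per parameter."""
    return tokens / params


def approx_training_flops(params: float, tokens: float) -> float:
    """Rule-of-thumb estimate: about 6 FLOPs per parameter per token."""
    return 6 * params * tokens


# (parameters, training tokens); token counts are assumptions, see above.
models = {
    "Chinchilla": (70e9, 1.4e12),
    "GPT-3": (175e9, 300e9),
}

for name, (params, tokens) in models.items():
    ratio = tokens_per_parameter(params, tokens)
    flops = approx_training_flops(params, tokens)
    print(f"{name}: {ratio:.1f} tokens per parameter, ~{flops:.2e} FLOPs")
```

Under these assumptions, Chinchilla saw roughly ten times more data per parameter than GPT-3, even though it has less than half as many parameters.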

When announcing GPT-4, OpenAI refrained from disclosing the number of parameters and the training data used, which was met with criticism. OpenAI cited the "competitive environment" as the reason for the secrecy.

Elon Musk cleared the way for Microsoft

Elon Musk's 2018 departure from OpenAI cleared the way for Microsoft, Semafor reports. Musk was one of the co-founders of OpenAI in 2015, and one of its backers. He reportedly warned OpenAI CEO Sam Altman, also a co-founder, in 2018 that OpenAI was falling behind Google's AI development. Musk wanted to take over OpenAI and run it himself, but Altman and the other OpenAI co-founders refused.

Musk then resigned from OpenAI in February 2018, canceling a planned major donation. Officially, OpenAI said at the time that Musk's departure from the board had to do with avoiding conflicts of interest in his role as CEO of Tesla, which is also researching AI. Andrej Karpathy, the developer of Tesla's autonomous driving system, left OpenAI for Tesla, but has since returned to OpenAI.

Musk is said to have made no further payments since leaving OpenAI. In total, he contributed $100 million of a planned $1 billion. Sam Altman, who reportedly has no equity in OpenAI, became president of OpenAI after Musk's departure.

Google's introduction of the Transformer architecture in 2017 created new opportunities in AI development, but it also further increased the cost of AI training. To keep pace, OpenAI, originally a nonprofit startup, announced a new for-profit structure in March 2019. Microsoft's initial $1 billion investment followed less than six months later.

After the huge success of ChatGPT, Musk reportedly revoked OpenAI's access to Twitter data, which OpenAI had contractually secured before Musk took over Twitter. Musk publicly criticized OpenAI's evolution from an open-source "non-profit" organization to a closed corporation allegedly controlled by Microsoft and focused on maximizing profits.

Elon Musk is said to be working on an alternative to ChatGPT. He specifically raised concerns about possible biased social or political views in ChatGPT, which is heavily restricted by OpenAI. OpenAI announced that in the future, customizable ChatGPT models could represent a greater variety of perspectives.
