Meta Platforms plans to release the largest version of its open-source language model Llama 3 on July 23, according to an employee.
The 405-billion-parameter model will be multimodal, capable of processing both images and text, reports The Information. This means the model should be able to generate new images from a combination of images and text, for example. Previous Llama models were limited to text generation.
There were rumors that Meta would not make the weights of the 405-billion-parameter model available. AI leaker Jimmy Apples reported on X about alleged objections raised by Facebook co-founder Dustin Moskovitz to Mark Zuckerberg.
Despite these objections, Meta has, "apparently as of the time of this update," decided to publish the model, including the weights, as open source, according to Jimmy Apples.
There are financial arguments against releasing the weights - training a model of this size is expensive - as well as safety concerns: with the weights, the open-source model is easier for more people to use directly, which some criticize from a safety perspective.
Weights are the learned parameters of an AI model that determine its predictions. Publishing them alongside open-source models enables reproducibility and facilitates practical application, transparency, and comparability.
When developers download a pre-trained model without weights, they get only the architecture of the model, the "empty shell" so to speak. This architecture defines the structure of the neural network - how many layers it has, how they are connected, and so on.
Without the weights, which are optimized during the training process, the model cannot make meaningful predictions or solve tasks. Depending on the size of the model and the amount of data, training can be time-consuming and resource-intensive.
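The distinction between architecture and weights can be sketched in a few lines of PyTorch. This is a hedged, minimal illustration - a tiny toy network, not Llama itself: constructing the model only defines the structure (the "empty shell" with randomly initialized parameters), while the weights are the separate, trained values that are actually shared when a model is released as open weights.

```python
import torch
import torch.nn as nn

# The architecture: layer sizes and connections, but no learned knowledge yet.
# A freshly constructed model like this has randomly initialized weights.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 2),
)

# With random weights, the output has the right shape but is meaningless
# for any real task - this is the "empty shell" described above.
x = torch.randn(1, 4)
output = model(x)

# "Releasing the weights" means sharing the trained parameter values,
# typically as a state dict, so others can restore them without retraining.
torch.save(model.state_dict(), "weights.pt")

# Anyone with the same architecture can then load the published weights.
restored = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
restored.load_state_dict(torch.load("weights.pt"))
```

Downloading only the model code without `weights.pt` corresponds to getting the bare architecture; the expensive training step is exactly what the published weights let developers skip.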
Access to weights allows developers without massive training capabilities to use and develop advanced AI models. This is why weights are so important and coveted in the open-source AI community.