Apple has released OpenELM, a family of open-source language models designed for efficiency, achieving strong performance with less training data.

OpenELM (Open-source Efficient Language Models) is a family of open-source language models with up to three billion parameters. The models use a layer-wise scaling strategy that more efficiently distributes parameters within the transformer model layers, the researchers write. As a result, OpenELM achieves higher accuracy than comparable models.
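
To illustrate the idea: instead of giving every transformer layer the same number of attention heads and the same feed-forward width, layer-wise scaling grows both from the early to the late layers. The following Python sketch shows how such a per-layer allocation could be computed; the linear interpolation follows the paper's description, but the concrete scaling values are illustrative, not Apple's actual hyperparameters.

```python
# Illustrative sketch of layer-wise scaling: attention heads and FFN width
# grow linearly from the first to the last transformer layer, instead of
# being identical in every layer. All numeric values are made up.

def layerwise_dims(num_layers: int, d_model: int, head_dim: int,
                   alpha_min: float, alpha_max: float,
                   beta_min: float, beta_max: float):
    """Return (num_heads, ffn_dim) for each layer index."""
    dims = []
    for i in range(num_layers):
        t = i / (num_layers - 1)  # interpolation factor in [0, 1]
        alpha = alpha_min + (alpha_max - alpha_min) * t  # scales attention
        beta = beta_min + (beta_max - beta_min) * t      # scales FFN width
        num_heads = max(1, round(alpha * d_model / head_dim))
        ffn_dim = round(beta * d_model)
        dims.append((num_heads, ffn_dim))
    return dims

# Early layers end up narrow, later layers wide, for the same total budget.
for i, (h, f) in enumerate(layerwise_dims(8, 1024, 64,
                                          alpha_min=0.5, alpha_max=1.0,
                                          beta_min=0.5, beta_max=4.0)):
    print(f"layer {i}: {h} heads, FFN dim {f}")
```

The intuition is that later layers benefit from more capacity, so shifting parameters toward them yields better accuracy for the same overall parameter count.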

The OpenELM model with 1.1 billion parameters outperforms the Allen Institute for AI's OLMo model with 1.2 billion parameters by 2.36 percent, despite using only half as many tokens for pre-training. Simply put, OpenELM achieves slightly better performance with less data and compute.

OpenELM requires less data for better performance compared to same-class models. | Image: Mehta et al.

The OpenELM models come in four sizes: 270 million, 450 million, 1.1 billion, and 3 billion parameters. Each model is also available in an instruction-tuned version. All of them can be downloaded from GitHub and Hugging Face.


Apple provides the entire training and fine-tuning framework as open source, including the training protocol, multiple checkpoints, and the pre-training configuration. Apple also releases code to convert the models to the MLX library, enabling inference and fine-tuning on Apple devices.
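
For readers who want to try the models, here is a minimal sketch of loading an OpenELM checkpoint with the Hugging Face transformers library. The apple/OpenELM-270M repository id and the reuse of the Llama 2 tokenizer reflect the release at the time of writing; treat both as assumptions to verify against the model card. OpenELM ships custom model code, so trust_remote_code=True is required.

```python
# Minimal sketch: running an OpenELM checkpoint via Hugging Face transformers.
# Assumes `pip install transformers torch` and access to the repos below.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-270M", trust_remote_code=True  # custom model code
)
# OpenELM reuses the Llama 2 tokenizer rather than shipping its own;
# the meta-llama repo is gated and may require approval on Hugging Face.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Apple released OpenELM,", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```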

For training, the company used publicly available datasets such as RefinedWeb, a deduplicated version of the Pile, and parts of RedPajama and Dolma 1.6. In total, the training dataset comprises approximately 1.8 trillion tokens.

Safe, private, on-device

OpenELM is likely another building block in Apple's AI strategy, which focuses on privacy, efficiency, and control, with generative AI primarily on the device.

This could mean improvements to the Siri voice assistant or new generative AI features in apps like Mail or News. Apple wants to show that it can build leading AI systems without exploiting user data.

For advanced cloud AI applications, Apple could partner with Google, OpenAI, and others. Details of Apple's generative AI strategy are expected to be announced at its WWDC developer conference, which begins June 10.

Summary
  • Apple releases OpenELM, a set of open-source language models that outperform comparable models with less training data and computation time.
  • The OpenELM models are available in four sizes, from 270 million to 3 billion parameters, each of which is also available in an instruction-optimized version. Apple is making the entire framework, including the training protocol, checkpoints and configuration, available as open source.
  • OpenELM is likely to be part of Apple's AI strategy, which focuses on privacy, efficiency and control, and aims to deliver generative AI primarily on users' devices.