
Run LLMs on your M-series Mac with Apple's new MLX machine learning framework

Image: Midjourney prompted by THE DECODER

Key Points

  • Apple has released MLX, a machine learning framework designed for Apple silicon, and MLX Data, a data loading package developed by Apple's machine learning research team.
  • MLX provides features such as composable function transformations, lazy computation, and multi-device support to enable efficient operations on supported Apple silicon devices.
  • With MLX and MLX Data, users can perform tasks such as language model training, text generation, image generation, and speech recognition. Apple MLX has the potential to strengthen the open-source AI movement.

Apple has released MLX, an efficient machine learning framework tailored for Apple silicon, and MLX Data, a flexible data loading package.

Both have been released by Apple's machine learning research team. MLX's Python API closely follows NumPy, with a few differences. Its headline features, illustrated in the short sketch after this list, include:

  • Composable function transformations: MLX has composable function transformations for automatic differentiation, automatic vectorization, and computation graph optimization.
  • Lazy computation: Computations in MLX are lazy. Arrays are only materialized when needed.
  • Multi-device: Operations can run on any of the supported devices (CPU, GPU, …).
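
A minimal sketch of what this looks like in practice, assuming MLX is installed (`pip install mlx`) on an Apple silicon Mac; the toy `loss` function is our own, not from Apple's examples:

```python
import mlx.core as mx

# NumPy-like array creation
a = mx.array([1.0, 2.0, 3.0])
b = mx.ones(3)

# Lazy computation: c only records the computation graph here ...
c = a + b
mx.eval(c)  # ... and is materialized on demand

# Composable function transformation: mx.grad turns a scalar-valued
# function into a new function that computes its gradient.
def loss(x):
    return mx.sum(mx.square(x))

grad_fn = mx.grad(loss)
print(grad_fn(a))  # d(sum(x^2))/dx = 2x -> array([2, 4, 6])
```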

The design of MLX is inspired by frameworks such as PyTorch, JAX, and ArrayFire. A notable difference between those frameworks and MLX is the unified memory model, Apple writes: arrays in MLX live in shared memory, so operations on MLX arrays can run on any supported device type without copying data. MLX Data (GitHub) is a framework-agnostic, flexible data loading package.
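
In practice, unified memory means the compute device is chosen per operation rather than per array. A minimal sketch using MLX's documented `stream=` argument; the matrix sizes here are arbitrary:

```python
import mlx.core as mx

a = mx.random.uniform(shape=(4096, 4096))
b = mx.random.uniform(shape=(4096, 4096))

# The same arrays feed both ops; no copies between CPU and GPU.
c_gpu = mx.matmul(a, b, stream=mx.gpu)  # runs on the GPU
c_cpu = mx.matmul(a, b, stream=mx.cpu)  # runs on the CPU
mx.eval(c_gpu, c_cpu)
```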
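
And a hedged sketch of an MLX Data pipeline, assuming `pip install mlx-data`; the buffer/stream method names follow the package's documentation, and the integer samples are a stand-in for real data:

```python
import numpy as np
import mlx.data as dx

# A buffer is a finite, indexable dataset of dict-shaped samples.
dset = dx.buffer_from_vector([{"x": np.array(i)} for i in range(10)])

# Transformations compose lazily: shuffle, stream, batch, prefetch.
stream = dset.shuffle().to_stream().batch(4).prefetch(2, 2)
for batch in stream:
    print(batch["x"])  # arrays of up to 4 values each
```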

Run Mistral and Llama on your M2 Ultra

With MLX and MLX Data, users can train a Transformer language model, fine-tune with LoRA, generate text with Mistral, generate images with Stable Diffusion, and transcribe speech with Whisper. For an example of how to get started with MLX and Mistral, see this tutorial.
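
Training follows a familiar pattern via the companion `mlx.nn` and `mlx.optimizers` packages. A minimal sketch of a single training step, in which a toy two-layer network stands in for a real Transformer language model:

```python
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim

# Toy stand-in for a Transformer language model.
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.l1 = nn.Linear(16, 32)
        self.l2 = nn.Linear(32, 1)

    def __call__(self, x):
        return self.l2(nn.relu(self.l1(x)))

def loss_fn(model, x, y):
    return mx.mean(mx.square(model(x) - y))

model = MLP()
optimizer = optim.SGD(learning_rate=0.01)

# nn.value_and_grad differentiates with respect to the model's parameters.
loss_and_grad = nn.value_and_grad(model, loss_fn)

x = mx.random.uniform(shape=(8, 16))
y = mx.random.uniform(shape=(8, 1))

loss, grads = loss_and_grad(model, x, y)
optimizer.update(model, grads)                # apply the SGD step
mx.eval(model.parameters(), optimizer.state)  # force evaluation
print(loss)
```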


The following video shows a Llama v1 7B model implemented in MLX running on an M2 Ultra, highlighting the framework's performance on Apple silicon devices.

Video: Awni Hannun via Twitter.com

For details, see the MLX GitHub repository and Apple's documentation.

So far, Apple has mostly talked publicly about "machine learning" and how it's implementing ML features in its products, such as better word prediction for its iPhone keyboard.


Apple's move with MLX is interesting in that it potentially strengthens the open-source AI movement built around models like Meta's Llama, Mistral, and Stable Diffusion.

But Apple is also reportedly working internally on an LLM framework called Ajax and its own chatbot, and is said to be spending millions of dollars a day on AI training to keep up with ChatGPT and generative AI in general.


Source: Awni Hannun | GitHub | Docs