
A new AI model is making waves in the open-source scene. Now it's clear: it's a leak of an older Mistral model.

A few days ago, the files of the AI model "miqu-1-70b" were posted on HuggingFace by a user named "Miqu Dev". On the same day, an anonymous user posted a link to the files on 4chan. Yesterday, Mistral CEO Arthur Mensch confirmed that it was a leaked model from his company.

After the leak on 4chan, the model quickly attracted attention: initial community tests show that it matches or beats Mixtral, Mistral's strongest open-source model to date, in most tests. In some tests, it outperformed Mistral Medium, Mistral's strongest model, and in one benchmark it beat all language models except GPT-4.

Miqu-1-70B is an old Mistral model based on Llama-2

After much speculation, Mistral CEO Arthur Mensch confirmed on X yesterday that an "over-enthusiastic employee" of an early access customer had released a quantized and watermarked version of an old model trained by Mistral. According to Mensch, this is an old model that the company trained based on Meta's Llama 2. The pre-training of the model was completed at the time of the release of Mistral-7B, the company's first language model. Some had speculated that Mistral itself had leaked the model after the Paris-based AI startup first released its latest Mixtral model via a torrent.

The company appears to have no plans to remove the model from HuggingFace. Whether an official, licensed release is planned is unknown. Interestingly, Mensch responded to the HuggingFace post not by asking for its removal, but by jokingly suggesting that the poster "might consider attribution".

According to Mensch, the company has made great progress since developing the leaked model, so Mistral-Large can probably be expected to perform on par with GPT-4.

Summary
  • An AI model called "miqu-1-70b" recently surfaced on HuggingFace and 4chan and quickly caught the attention of the open-source community due to its high performance.
  • Mistral's CEO, Arthur Mensch, has now confirmed that "miqu-1-70b" is a leaked older model from his company that was trained based on Meta's Llama 2.
  • Although the model was released without an official license, Mistral has no plans to remove it from HuggingFace.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.