Jais is an open ChatGPT alternative for Arabic

Sep 3, 2023

Inception

Jais is a large language model focused on Arabic and is currently the best open model of its kind.

Researchers from the United Arab Emirates, in collaboration with Cerebras, have introduced two new open language models: Jais and Jais-chat. The models were trained on Arabic and English text as well as code, and significantly outperform existing open-source models for Arabic.

Jais is a 13-billion-parameter model pre-trained on 395 billion tokens, of which 116 billion are Arabic. Jais-chat has been instruction-tuned with an additional 10 million instruction/response pairs and outperforms all existing open Arabic/multilingual chatbots.

The models are the first Arabic-centric open models of this scale.

Jais can match ChatGPT in some tasks

Arabic websites, books, news, and Wikipedia were used as training data, with all data filtered before training. To compensate for the limited amount of Arabic data available, the team added 232 billion tokens of English data from The Pile by EleutherAI, along with 46 billion code tokens.
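To put the training mix in proportion, the token counts reported in this article can be turned into percentage shares. The following is only a rough sketch: the function and variable names are made up for illustration, and the figures are the rounded ones quoted here, not exact counts from the team.

```python
# Token counts in billions, as quoted in the article (rounded figures).
TOTAL_TOKENS_B = 395
MIX_B = {"arabic": 116, "english": 232, "code": 46}


def mix_shares(mix, total):
    """Return each component's share of the total as a percentage."""
    return {name: 100 * count / total for name, count in mix.items()}


shares = mix_shares(MIX_B, TOTAL_TOKENS_B)
component_sum = sum(MIX_B.values())

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
print(f"components sum to {component_sum}B of {TOTAL_TOKENS_B}B reported")
```

By this calculation, Arabic makes up a little under 30 percent of the pre-training data and English a little under 60 percent. The three components add up to 394 billion tokens, one billion short of the reported total, which is presumably a rounding effect.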

In benchmarks, Jais and Jais-chat outperform existing, freely available Arabic models by 11 to 15 points in accuracy and are competitive with Meta's Llama 2 for English, according to the team. Commercial models such as OpenAI's ChatGPT or Anthropic's Claude are still ahead on average in the benchmarks, but are also significantly larger. However, for some tasks, such as writing, Jais and Jais-chat are on par with ChatGPT, the team said.

The team also provides a number of safety mechanisms for Jais-chat, such as filters and classifiers for unwanted requests and output.

Another special feature of the model: it was not trained on Nvidia GPUs, but on Cerebras' CS-2 systems. The company produces a wafer-sized AI chip that is installed in the CS-2 systems.

Jais and Jais-chat are available on Hugging Face and can be tried out on Arabic-GPT.
