A team of Japanese researchers is using Fujitsu's Fugaku supercomputer to train Fugaku-LLM, a large language model specifically adapted to the Japanese language and culture.
Large language models such as OpenAI's GPT-4 are primarily developed by US companies and optimized for English. According to the researchers, existing models often struggle with the intricacies of Japanese language and culture, for example confusing rare characters or failing to apply cultural communication norms appropriately.
To address this, a team of researchers from Tokyo Institute of Technology, Tohoku University, Fujitsu, RIKEN, Nagoya University, and the companies CyberAgent and Kotoba Technologies is developing Fugaku-LLM. The model is designed to conduct natural dialogues that consider Japanese polite language and other features of the language.
A distinctive aspect of Fugaku-LLM is that about 60 percent of its training data is Japanese, with the remainder consisting of English as well as mathematical and code data. Unlike models that take an existing English model and continue training it on Japanese, Fugaku-LLM has learned much of its information directly in Japanese, according to the research team.
The model was trained on the Japanese supercomputer Fugaku, which uses CPUs developed by Fujitsu instead of GPUs. Training drew on 13,824 compute nodes and 380 billion tokens; the resulting model has 13 billion parameters.
The research team claims that Fugaku-LLM is the best open model developed in Japan with its own data, achieving a score of 9.18 on the humanities and social sciences tasks of the Japanese MT-Bench provided by Stability AI.
The language models and source code of Fugaku-LLM are available on Hugging Face, GitHub, and the Fujitsu Research Portal for research and commercial purposes, as long as users comply with the Apache 2.0 license.
The Japanese government and companies such as NEC, Fujitsu, and SoftBank are investing hundreds of millions of dollars in developing their own language models. They aim to promote domestic research with more culturally sensitive models and to become less dependent on large US technology companies.
Whether that works out remains to be seen. OpenAI recently released a Japanese-optimized version of GPT-4, which is already being used in projects with the Japanese government.