Researchers have developed an algorithm that could dramatically reduce the energy consumption of artificial intelligence systems.
Scientists at BitEnergy AI created a method called "Linear-complexity multiplication" (L-Mul) that replaces the computationally expensive floating-point multiplications in AI models with much cheaper integer additions.
According to the study "Addition is All You Need for Energy-Efficient Language Models", L-Mul could cut the energy cost of element-wise floating-point tensor multiplications by up to 95% and of dot products by 80%. The team tested the approach on a range of language, vision, and reasoning benchmarks covering natural language understanding, structural reasoning, mathematics, and commonsense question answering.
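The article does not detail the arithmetic, but the general idea can be illustrated: because a floating-point number is stored as an exponent plus a fractional mantissa, adding the integer bit patterns of two floats behaves like adding logarithms, which approximates multiplying the values. The Python sketch below shows this with a Mitchell-style approximation; it is not BitEnergy AI's L-Mul kernel, only a rough illustration of trading a multiply for an integer addition, and the function names are made up for this example.

```python
import struct

def float_to_bits(x: float) -> int:
    """Reinterpret a float32 value as its 32-bit integer pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(b: int) -> float:
    """Reinterpret a 32-bit integer pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]

def add_only_mul(x: float, y: float) -> float:
    """
    Approximate x * y without a hardware multiply: add the bit patterns of
    the two floats and subtract the float32 exponent bias (0x3F800000).
    Since the bit pattern encodes exponent + mantissa, this integer addition
    acts like adding logarithms, i.e. like multiplying. This is a hedged
    illustration only; zeros, infinities, NaNs and subnormals are ignored.
    """
    sign = (float_to_bits(x) ^ float_to_bits(y)) & 0x80000000
    magnitude = float_to_bits(abs(x)) + float_to_bits(abs(y)) - 0x3F800000
    return bits_to_float(sign | (magnitude & 0x7FFFFFFF))

if __name__ == "__main__":
    for a, b in [(1.5, 2.25), (3.14, -0.8), (0.01, 123.4)]:
        print(f"{a} * {b}: exact = {a * b:.4f}, approx = {add_only_mul(a, b):.4f}")
```

Running the sketch shows the add-based result tracking the exact product within a few percent, which is why this family of approximations can be traded against the much higher energy cost of a true floating-point multiply.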
Direct use in attention mechanisms possible

The researchers say L-Mul can be applied directly to the attention mechanism in transformer models with minimal performance loss. The attention mechanism is a core component of modern language models like GPT-4o.
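The article does not say how L-Mul would be wired into attention, but the attention computation is dominated by two matrix products, the query-key scores and the weighted sum of values, and both are built from multiply-accumulate steps. The sketch below (plain NumPy, with made-up names mul_add_matmul and mul, and the softmax left in ordinary floating point) is only an assumption-laden illustration of where an L-Mul-style approximate multiply could in principle be plugged in.

```python
import numpy as np

def mul_add_matmul(A: np.ndarray, B: np.ndarray, mul=np.multiply) -> np.ndarray:
    """
    Matrix product written around an explicit element-wise multiply so the
    multiplication primitive can be swapped out (e.g. for an L-Mul-style
    approximation). Accumulation is plain addition. Purely illustrative:
    a real kernel would fuse and vectorize this.
    """
    n, d = A.shape
    d2, m = B.shape
    assert d == d2, "inner dimensions must match"
    out = np.zeros((n, m), dtype=A.dtype)
    for k in range(d):
        out += mul(A[:, k:k + 1], B[k:k + 1, :])  # rank-1 update per inner index
    return out

def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray, mul=np.multiply) -> np.ndarray:
    """Scaled dot-product attention with a pluggable multiply primitive."""
    d = Q.shape[-1]
    scores = mul_add_matmul(Q, K.T, mul) / np.sqrt(d)   # query-key scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax stays in float
    return mul_add_matmul(weights, V, mul)              # weighted sum of values

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Q = rng.standard_normal((4, 8)).astype(np.float32)
    K = rng.standard_normal((4, 8)).astype(np.float32)
    V = rng.standard_normal((4, 8)).astype(np.float32)
    print(attention(Q, K, V).shape)  # -> (4, 8)
```

Passing an approximate multiply as the mul argument would route every product in both matrix multiplications through it, while the rest of the attention computation stays unchanged.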
BitEnergy AI sees potential for L-Mul to strengthen academic and economic competitiveness, as well as AI sovereignty. They believe it could enable large organizations to develop custom AI models faster and more cost-effectively.
The team plans to implement L-Mul algorithms at the hardware level and develop programming APIs for high-level model design. Their goal is to train text, symbolic, and multimodal AI models optimized for L-Mul hardware.