
Google Deepmind has unveiled AlphaEvolve, a system that uses large language models to automate the discovery and refinement of algorithms. The aim is to develop new algorithms, systematically improve them, and adapt them for practical use.


AlphaEvolve works by combining two versions of Google's Gemini language models. Gemini Flash is responsible for generating a wide range of program candidates, while Gemini Pro takes a closer look at these proposals and analyzes them in depth.

Automated evaluators then score the candidates against objective metrics such as accuracy and computational efficiency, and an evolutionary algorithm selects the top-performing variants for further refinement in an ongoing loop.

Deepmind describes AlphaEvolve as a "test-time compute agent"—an AI system that actively explores and evaluates new solutions while it runs. Like current reasoning models, AlphaEvolve uses test-time computation to improve its results through active problem-solving.


However, while "standard" reasoning models typically apply test-time reasoning to generate step-by-step answers for a single prompt, AlphaEvolve goes further by running an evolutionary loop: it generates, tests, and refines entire algorithms over multiple iterations. This approach builds on the principles of test-time reasoning, but extends them to the broader task of automated algorithm discovery and optimization.

Figure: The AlphaEvolve workflow. A scientist or engineer provides an initial program and configuration; AlphaEvolve then launches an evolutionary loop in which prompt samplers, language model ensembles, and evaluators collaborate to iteratively generate and refine programs, storing the best results in a program database. | Image: Google Deepmind
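
To make the loop concrete, here is a minimal sketch in Python. It is an illustration of the general scheme rather than Deepmind's implementation: the hypothetical functions propose and evaluate stand in for Gemini Flash generating program variants and for the automated evaluators scoring them, Gemini Pro's in-depth review is folded into evaluation for brevity, and the real system evolves entire programs rather than a single number.

    import random

    def propose(parent, n_candidates=8):
        """Stand-in for Gemini Flash: generate a batch of varied candidate tweaks."""
        return [parent + random.gauss(0, 1.0) for _ in range(n_candidates)]

    def evaluate(candidate):
        """Stand-in for the automated evaluators: score a candidate, higher is better."""
        return -(candidate - 3.0) ** 2  # toy objective with its optimum at 3.0

    def evolve(initial, generations=50, keep=4):
        """Evolutionary loop: sample a parent, propose variants, score them, keep the best."""
        database = [initial]                      # program database, seeded by the user
        for _ in range(generations):
            parent = random.choice(database)      # prompt sampler picks a parent
            database.extend(propose(parent))      # LLM ensemble proposes modified versions
            database = sorted(database, key=evaluate, reverse=True)[:keep]  # select the fittest
        return database[0]

    print(round(evolve(0.0), 2))  # converges toward the toy optimum at 3.0

In AlphaEvolve itself, each database entry is a complete program, and the evaluators compile and run it against objective metrics such as accuracy or computational efficiency before the best variants are fed back into the next round of prompts.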

Deepmind says the progress AlphaEvolve achieves can also be distilled back into the underlying language models, compressing what the system learns into more efficient model versions.

Efficiency gains across Google infrastructure

Deepmind says AlphaEvolve is already being used in multiple parts of Google's infrastructure. In one example, the system developed a new heuristic for resource allocation in Google's "Borg" cluster management system, which recovers an average of 0.7 percent of the company's worldwide computing resources. Deepmind notes that the new solution is robust and easy for humans to read and understand.

AlphaEvolve has also helped optimize the Gemini model itself. By improving how large matrix multiplications are broken down into subproblems, the system reduced Gemini's training time by one percent. In another case, AlphaEvolve optimized the FlashAttention kernel, GPU code that is critical for training and running Transformer-based language models, achieving a speedup of up to 32.5 percent in benchmark tests.

These types of optimizations are especially difficult, since they are deeply embedded in the software stack and are rarely revised by human engineers, the researchers write.


AlphaEvolve is designed to make this process much more efficient: optimizations that used to take weeks of manual work can now be completed in just a few days of automated experimentation. According to Google Deepmind, this leads to faster adaptation to new hardware and lower long-term development costs for AI systems.

Progress in mathematics

Google Deepmind also tested AlphaEvolve on more than 50 open mathematical problems. The system rediscovered the best known solutions in about 75 percent of cases and found better solutions than previously known in roughly 20 percent of cases.

One standout example is the kissing number problem, which asks how many spheres of the same size can touch a central sphere without overlapping. For the eleven-dimensional case, AlphaEvolve discovered a new configuration with 593 spheres, setting a new lower bound.

In the area of algorithmic mathematics, AlphaEvolve developed a new way to multiply two complex-valued 4x4 matrices using just 48 scalar multiplications, an improvement over the classic Strassen algorithm from 1969. While its predecessor AlphaTensor focused exclusively on matrix multiplication, AlphaEvolve is designed for a much broader range of algorithmic challenges.
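
For context on the numbers, a quick count helps: the schoolbook method needs 64 scalar multiplications for a 4x4 product, and Strassen's 1969 construction, which replaces 8 block multiplications with 7, needs 49 when applied recursively to the 4x4 case. That 49 is the figure the new 48-multiplication algorithm improves on:

    # Scalar-multiplication counts for 4x4 matrix multiplication (illustrative arithmetic)
    naive = 4 ** 3                               # schoolbook method: 64
    strassen_2x2 = 7                             # Strassen (1969): 7 multiplications per 2x2 product
    strassen_4x4 = strassen_2x2 * strassen_2x2   # recursive application to 4x4: 49
    alphaevolve_4x4 = 48                         # AlphaEvolve's algorithm for complex-valued matrices
    print(naive, strassen_4x4, alphaevolve_4x4)  # 64 49 48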


According to Deepmind, these results indicate that AlphaEvolve is capable of more than just replicating existing solutions; it can also discover new approaches in specialized areas of computer science and mathematics.

Where AlphaEvolve fits—and where it doesn't (yet)

AlphaEvolve is best suited for problems that can be expressed algorithmically and evaluated automatically. Deepmind sees application potential in areas like materials research, drug development, and industrial process optimization. The company is also working on a user-friendly interface and an early access program for academic institutions.

There are limits, though. For problems that can't be reliably assessed by automated tests—such as those requiring real-world experiments—AlphaEvolve is less effective. Deepmind says that in the future, language models could provide initial qualitative assessments before more structured evaluation takes place, and the company is already exploring these hybrid approaches.

AlphaEvolve builds on FunSearch, Deepmind's earlier system launched in 2023, but takes things much further. While FunSearch found solutions for mathematical problems like cap sets and bin packing, AlphaEvolve is designed to create complete, practical algorithms for a much broader range of challenges.

Summary
  • With AlphaEvolve, Google DeepMind has created a system that uses large language models in an automated process to design new algorithms and refine existing ones.
  • The approach merges varied program ideas generated by Gemini Flash with detailed assessments by Gemini Pro, applying an evolutionary algorithm to select and improve the most promising solutions.
  • Already implemented in Google's infrastructure to optimize key computing tasks for AI training, AlphaEvolve was able to recreate known solutions in about 75 percent of over 50 mathematical problems and exceeded previous best results in roughly 20 percent of cases.