Researchers at UC Berkeley have used artificial intelligence to design a faster load balancing algorithm. In a preprint titled “Barbarians at the Gate: How AI is Upending Systems Research,” a team of 17 researchers describes how AI models can be used to discover and refine algorithms.
The researchers used OpenEvolve, an open-source version of Google DeepMind’s AlphaEvolve, to improve a load balancing algorithm until it significantly outperformed human-designed alternatives. They reported a fivefold speedup for an Expert Parallelism Load Balancer (EPLB) algorithm, which matters for large language models that route tokens to specialized expert modules, because it balances that work across GPUs.
The authors assert that AI-Driven Research for Systems (ADRS) has the potential to revolutionize systems research. “As AI assumes a central role in algorithm design, we argue that human researchers will increasingly focus on problem formulation and strategic guidance,” they stated. Their findings emphasize both the transformative potential of AI and the urgent need to adapt research methodologies to it.
In May, Google highlighted AlphaEvolve as an “evolutionary coding agent” that enhances the efficiency of data center orchestration, optimizes matrix multiplication in its Tensor Processing Unit hardware, and improves the FlashAttention kernel in Transformer-based AI models. Additionally, a recent paper from Google DeepMind described an autonomous method for discovering reinforcement learning rules through agent interactions with diverse environments.
The UC Berkeley team demonstrated the effectiveness of AI-driven optimization by using OpenEvolve to enhance load balancing across GPUs involved in large language model inference. They began with an open-source EPLB implementation from DeepSeek, which they noted was inefficient due to its reliance on Python and linear search methods. This initial version averaged approximately 540 milliseconds for rebalancing tasks.
They compared it with a faster, non-public EPLB implementation from an undisclosed laboratory, which rebalanced in 19.6 milliseconds. Running OpenEvolve with a mix of 80 percent Gemini 2.5 Flash and 20 percent Gemini 2.5 Flash Lite at minimal cost, the researchers arrived at a more efficient method that uses vectorized tensor operations and a zig-zag partitioning scheme, with a runtime of 3.7 milliseconds. That is roughly a fivefold improvement over the unidentified reference implementation (19.6 ms versus 3.7 ms) and a 146-fold improvement over DeepSeek’s algorithm (540 ms versus 3.7 ms).
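To make the idea of zig-zag partitioning concrete, here is a minimal sketch in plain Python. It is not the code OpenEvolve produced, and it is not vectorized; the function name `zigzag_partition`, the per-expert load list, and the GPU count are all hypothetical, chosen only for illustration. The sketch assumes that "zig-zag" means dealing experts out from heaviest to lightest, reversing direction on each pass so that the GPU that received the heaviest expert in one pass receives the lightest in the next.

```python
def zigzag_partition(loads, num_gpus):
    """Assign expert indices to GPUs in a zig-zag (snake) order.

    loads: hypothetical per-expert load estimates (e.g. token counts).
    Returns a list of num_gpus lists, each holding expert indices.
    """
    # Rank experts from heaviest to lightest.
    order = sorted(range(len(loads)), key=lambda i: loads[i], reverse=True)
    assignment = [[] for _ in range(num_gpus)]
    for rank, expert in enumerate(order):
        pass_idx, pos = divmod(rank, num_gpus)
        # Even passes go left to right, odd passes right to left.
        gpu = pos if pass_idx % 2 == 0 else num_gpus - 1 - pos
        assignment[gpu].append(expert)
    return assignment


def gpu_loads(loads, assignment):
    """Total load placed on each GPU under a given assignment."""
    return [sum(loads[e] for e in experts) for experts in assignment]


if __name__ == "__main__":
    loads = [8, 7, 6, 5, 4, 3, 2, 1]
    assignment = zigzag_partition(loads, num_gpus=4)
    print(assignment)
    print(gpu_loads(loads, assignment))
```

With these toy loads, every GPU ends up with the same total, because each pass offsets the imbalance left by the previous one. A production version would presumably express the same index arithmetic as tensor operations rather than a Python loop, which is the kind of change the researchers credit for the speedup.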
In another example from the study, OpenEvolve tripled the speed of relational analytics workloads in which SQL queries invoke large language model inference on individual rows.
Co-author Audrey Cheng, a PhD candidate at UC Berkeley, addressed the nature of OpenEvolve’s “reasoning.” In correspondence with The Register, she stated, “These are hard questions to answer definitively as they involve whether large language models are genuinely ‘thinking’ or simply executing advanced probability calculations.” Cheng noted that these models are trained on a vast corpus of literature, which gives them an advantage in finding new applications for ideas from other domains.
Cheng believes that the implications of ADRS are substantial. “We focus on systems performance problems because AI can already outperform human expert solutions,” she explained. She anticipates widespread adoption of ADRS for performance tuning in companies operating large systems. Furthermore, once researchers establish robust evaluation and validation frameworks for other challenges like security and fault tolerance, Cheng expects ADRS to generate even more innovative solutions.
“The current bottleneck is having a robust evaluation and validation framework,” she concluded. “Once that is established, ADRS could potentially apply to a wide range of system problems and beyond the field of computer science.”
