Google Transitions Internal Workloads to Arm Architecture with AI Assistance

Google says it has ported approximately 30,000 of its production applications to the Arm architecture, and it aims to make all of its workloads able to run on both its in-house Axion silicon and traditional x86 processors. The company detailed the migration in a preprint paper published last week, titled “Instruction Set Migration at Warehouse Scale,” and in an accompanying blog post.

Applications already running on both architectures include popular services such as YouTube, Gmail, and BigQuery, along with some 30,000 others. The documents outline the migration process undertaken by the company, led by engineering fellow Parthasarathy Ranganathan and developer relations engineer Wolff Dobson. They noted that the project began with a focus on architectural differences such as floating-point drift, concurrency, and platform-specific performance issues.
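Floating-point drift is easier to picture with a small example. The paper's specific causes are not described here, so the sketch below only illustrates one general mechanism, chosen as an assumption: floating-point addition is not associative, so a build that sums values in a different order (for instance, after a compiler vectorizes a loop differently on another instruction set) can produce a different result from the same inputs. The function names are hypothetical.

```python
import math


def sum_left_to_right(values):
    """Add values strictly in order, as a simple scalar loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_pairwise(values):
    """Add values in a tree order, roughly as a vectorized reduction might."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return sum_pairwise(values[:mid]) + sum_pairwise(values[mid:])


# Illustrative inputs (an assumption, not data from the paper): large and
# small magnitudes mixed so that rounding depends on evaluation order.
values = [1e16, 1.0, -1e16, 1.0]

print(sum_left_to_right(values))  # order-dependent result
print(sum_pairwise(values))       # a different order-dependent result
print(math.fsum(values))          # correctly rounded reference sum
```

Tests that compare floating-point output bit for bit against values recorded on one architecture can fail for reasons like this even when the code is correct, which is why such checks are usually written with a tolerance.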

Ranganathan and Dobson explained that the initial phase involved migrating key applications, including F1, Spanner, and Bigtable, using standard software development practices such as regular meetings and dedicated engineers. Some of the expected issues did arise, but there were fewer complications than anticipated, largely because modern compilers and tools such as sanitizers helped head off surprises.

The engineers worked through a range of tasks during the migration of the 30,000 applications, leaning on existing automation tools. They also developed a new AI tool, “CogniPort,” to handle problems those tools could not resolve. CogniPort steps in automatically when a build or test fails and attempts to fix the error. The team reported that, under specific conditions, it succeeded approximately 30 percent of the time, and that it did best at fixing tests, platform-specific conditionals, and data representation problems.
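As a rough illustration of what a “platform-specific conditional” fix can look like, the sketch below selects an implementation based on the CPU architecture and falls back to a portable one instead of assuming x86. It is not CogniPort's output or Google's code; the kernel names and helper functions are hypothetical, and only `platform.machine()` is a real standard-library call.

```python
import platform

# Machine strings commonly reported by platform.machine() for each family.
ARM_NAMES = {"aarch64", "arm64"}
X86_NAMES = {"x86_64", "amd64"}


def normalize_arch(machine):
    """Map a raw machine string to a small set of architecture labels."""
    name = machine.lower()
    if name in ARM_NAMES:
        return "arm64"
    if name in X86_NAMES:
        return "x86_64"
    return "unknown"


def pick_kernel(machine=None):
    """Choose a (hypothetical) checksum kernel for the given architecture.

    Code written only for x86 often has just the first entry and no
    fallback; porting adds an Arm entry and a portable default.
    """
    arch = normalize_arch(machine or platform.machine())
    kernels = {
        "x86_64": "checksum_sse42",  # hypothetical x86-specific kernel
        "arm64": "checksum_neon",    # hypothetical Arm-specific kernel
    }
    return kernels.get(arch, "checksum_portable")


print(pick_kernel())           # kernel for the machine running this script
print(pick_kernel("aarch64"))  # kernel an Arm server would select
```

Keeping a portable default means a workload scheduled onto an unfamiliar architecture degrades to slower code rather than failing outright.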

That success rate may seem modest, but Google has a substantial workload ahead, with an estimated 70,000 more applications to migrate. The company's ultimate goal is to enable Borg, its renowned cluster manager and the system that inspired Kubernetes, to efficiently allocate internal workloads across Arm servers. The transition is expected to save money: Google claims that its Axion-powered machines deliver up to 65 percent better price-performance than x86 instances and are up to 60 percent more energy efficient.

Given the scale of this migration and the performance benefits Google describes, the company appears likely to reduce its reliance on x86 processors in the foreseeable future.