Maxing power while sparing the planet: Arm CEO addresses energy consumption drawback of burgeoning AI

17 Apr, 2024
Newsdesk
Rene Haas, CEO of Cambridge superchip architect Arm, says the company is on a mission to tackle AI’s insatiable energy consumption. While Arm and partners are leading the sensible deployment of Artificial Intelligence in all aspects of technology, Haas concedes that the power that drives progress has to be tamed.
Rene Haas. Courtesy – Arm.

The NASDAQ-quoted company’s eloquent evangelist says Arm is desperate to find the right balance – harnessing the immense opportunities stemming from AI without harming the planet through the power consumption that AI entails.

He said: “AI has the potential to exceed all the transformative innovations created in the past century. The benefits to society around healthcare, productivity, education and many other areas will be beyond our imagination.

“To run these complex AI workloads, the amount of compute required in the world’s data centres needs to exponentially scale. However, this insatiable need for compute has exposed a critical challenge: The immense power that data centres require to fuel this groundbreaking technology.”

Haas added: “Today’s data centres already consume lots of power: Globally 460 terawatt-hours (TWh) of electricity are needed annually. That’s equivalent to the entire country of Germany.

“The rise in AI is expected to increase this figure 3x by 2030 – more than the total power consumption of India, the most populated country in the world.

“Future AI models will continue to become larger and smarter, fuelling the need for more compute, which increases demand for power as part of a virtuous cycle.

“Finding ways to reduce the power requirements for these large data centres is paramount to achieving the societal breakthroughs and realising the AI promise. In other words, no electricity, no AI. Companies need to rethink everything to tackle energy efficiency.”
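The arithmetic behind the figures Haas quotes can be checked with a short sketch. The 460 TWh baseline and the 3x multiplier come from his remarks; the function names are illustrative, and the six-year horizon assumes the baseline refers to 2024, the date of this article, which the source does not state.

```python
# Figures quoted by Haas in the article.
CURRENT_TWH = 460.0    # global data-centre electricity use per year
GROWTH_FACTOR = 3.0    # "3x by 2030"


def projected_demand_twh(current_twh: float, factor: float) -> float:
    """Return annual demand after scaling the current figure by `factor`."""
    return current_twh * factor


def implied_annual_growth(factor: float, years: int) -> float:
    """Compound annual growth rate that yields `factor` over `years` years."""
    return factor ** (1.0 / years) - 1.0


projected = projected_demand_twh(CURRENT_TWH, GROWTH_FACTOR)

# Assumption: the baseline year is 2024, leaving six years to 2030.
cagr = implied_annual_growth(GROWTH_FACTOR, 2030 - 2024)

print(f"Projected 2030 demand: {projected:.0f} TWh")
print(f"Implied compound growth: {cagr:.1%} per year")
```

On those assumptions, a threefold rise takes annual demand to 1,380 TWh and implies growth of roughly a fifth every year.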

Arm reimagining the future of AI

The power efficiency DNA of Arm – a company whose initial products were designed to run off batteries and sparked the mobile-phone revolution – allows the industry to rethink how chips are built to accommodate these growing demands of AI, Haas explains.

He adds: “In a typical server rack, the compute chip alone can consume more than 50 per cent of the power budget. Engineers are looking for any method to find ways to reduce this number; every watt counts.

“It’s no surprise that in this search, the world’s largest AI hyperscalers have turned to Arm to reduce power. Arm’s latest Neoverse CPU is the most high-performant, power-efficient processor for cloud data centres versus the competition.

“Neoverse offers hyperscalers the flexibility to customise their silicon to optimise for their demanding workloads, all while delivering leading performance and energy efficiency.

“Every watt saved enables more compute. This is why Amazon, Microsoft, Google, and Oracle have now all adopted Arm Neoverse technology to solve both general-purpose compute and CPU-based AI inference and training. Arm Neoverse is on the path to being the de-facto standard across cloud data centres.”

Haas urges observers to reflect on the data in recent announcements:

• AWS Arm-based Graviton: 25 per cent faster performance for Amazon Sagemaker for AI inference, 30 per cent faster for web applications, 40 per cent faster for databases, and 60 per cent more efficient than competition.

• Google Cloud Arm-based Axion: 50 per cent more performance and 60 per cent better energy efficiency compared to legacy competition architectures, powering CPU-based AI inference and training, YouTube and Google Earth, among others.

• Microsoft Azure Arm-based Cobalt: 40 per cent performance improvement over competition, powering services such as Microsoft Teams and coupling with Maia accelerators to drive Azure’s end-to-end AI architecture.

• Oracle Cloud Arm-based Ampere Altra Max: 2.5 times more performance per rack of servers at 2.8 times less power versus traditional competition; it is being used for generative AI inference models – summarisation, tokenisation of data for LLM training, and batched inference use cases.
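Performance and power claims of this kind can be folded into a single performance-per-watt figure. The sketch below does so for the Oracle numbers, assuming that "2.5 times more performance" means 2.5 times the baseline and that "2.8 times less power" means the baseline power divided by 2.8; neither reading is confirmed by the announcement, and the function name is illustrative.

```python
def perf_per_watt_gain(perf_ratio: float, power_reduction: float) -> float:
    """Relative performance per watt versus the baseline.

    perf_ratio: new performance divided by baseline performance.
    power_reduction: baseline power divided by new power
        (the "N times less power" figure).
    """
    # (new_perf / new_power) / (base_perf / base_power)
    #   = perf_ratio * power_reduction
    return perf_ratio * power_reduction


# Oracle figures as quoted: 2.5x performance per rack at 2.8x less power.
gain = perf_per_watt_gain(2.5, 2.8)
print(f"Performance per watt: about {gain:.1f}x the baseline")
```

Under that interpretation, the two ratios multiply to roughly a sevenfold improvement in performance per watt.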

Haas says: “It’s evident that Arm Neoverse has enabled vast improvements on performance and power-efficiency for general-purpose compute in the cloud.

“However, customers are now finding the same benefits for accelerated computing. Large-scale AI training requires unique accelerated computing architectures, like the NVIDIA Grace Blackwell platform (GB200), which combines NVIDIA’s Blackwell GPU architecture with the Arm-based Grace CPU.

“This Arm-based computing architecture enables system-level design optimisations that reduce energy consumption by 25x and provide a 30x increase in performance per GPU compared to NVIDIA H100 GPUs using competitive architectures for LLMs.

“These optimisations, which deliver game-changing performance and power savings, are only possible thanks to the unprecedented flexibility for silicon customisation that Arm Neoverse enables.

“As Arm deployments broaden, these companies could save upwards of 15 per cent of the total data centre power. Those enormous savings could then be used to drive additional AI capacity within the same power envelope and not add to the energy problem.

“To put it in perspective, these energy savings could run 2 billion additional ChatGPT queries, power a quarter of all daily web search traffic, light 20 per cent of American households, or power a country the size of Costa Rica. That’s a staggering impact on both energy consumption and environmental sustainability.

“At a foundational level, Arm CPUs are powering the AI revolution while benefiting the planet.”
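The 15 per cent savings figure can be set against the 460 TWh baseline quoted earlier to show what "the same power envelope" could mean in practice. This is a rough sketch rather than Arm's own calculation: it assumes the 15 per cent applies to the whole 460 TWh, and that compute capacity scales linearly with power. The function names are illustrative.

```python
BASELINE_TWH = 460.0      # annual data-centre consumption quoted by Haas
SAVINGS_FRACTION = 0.15   # "upwards of 15 per cent"


def energy_saved_twh(total_twh: float, fraction: float) -> float:
    """Energy saved per year if `fraction` of the total is eliminated."""
    return total_twh * fraction


def extra_capacity_in_same_envelope(fraction: float) -> float:
    """Additional compute that fits in the original power budget.

    Assumption: compute scales linearly with power, so workloads that
    need (1 - fraction) of the power leave room for 1 / (1 - fraction)
    times the compute.
    """
    return 1.0 / (1.0 - fraction) - 1.0


saved = energy_saved_twh(BASELINE_TWH, SAVINGS_FRACTION)
headroom = extra_capacity_in_same_envelope(SAVINGS_FRACTION)

print(f"Energy saved: about {saved:.0f} TWh per year")
print(f"Extra compute in the same envelope: about {headroom:.1%}")
```

On these assumptions the saving is about 69 TWh a year, and the freed power would support a little under 18 per cent more compute without raising total consumption.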