The Indian exascale supercomputer, built on the indigenous AUM microprocessor, is slated to become the world's second-fastest supercomputer, surpassing Japan's Fugaku

India is developing an ARM-based high-performance computing (HPC) processor to power its first exascale supercomputer, which is expected to be ready this year. The processor, called AUM, is being developed by the Centre for Development of Advanced Computing (C-DAC), an autonomous scientific body under the Ministry of Electronics and Information Technology (MeitY).

AUM is based on a 5-nanometer node and will have 96 cores, which C-DAC says will put it ahead of the Fujitsu A64FX processor that powers Fugaku, the second fastest supercomputer in the world. C-DAC claims AUM will offer 4.6 teraflops of compute power per socket, compared with the 2.7 teraflops per socket mustered by Fugaku.
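The per-socket comparison is simple arithmetic, and a short sketch makes it easy to check. The Python below uses only figures quoted in this article (4.6 and 2.7 teraflops per socket, 96 cores for AUM, 48 cores for the A64FX). The `Chip` class and its field names are illustrative, and the per-core numbers assume peak throughput is spread evenly across cores, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """Peak figures as quoted in this article (hypothetical container type)."""
    name: str
    cores: int
    peak_teraflops: float  # claimed peak per socket

    @property
    def gigaflops_per_core(self) -> float:
        # Assumption: peak throughput divides evenly across all cores.
        return self.peak_teraflops * 1000 / self.cores


aum = Chip("AUM", cores=96, peak_teraflops=4.6)
a64fx = Chip("A64FX", cores=48, peak_teraflops=2.7)

# How much more peak compute one AUM socket claims than one A64FX socket.
socket_ratio = aum.peak_teraflops / a64fx.peak_teraflops

print(f"Per-socket advantage: {socket_ratio:.2f}x")
for chip in (aum, a64fx):
    print(f"{chip.name}: {chip.gigaflops_per_core:.1f} gigaflops per core")
```

On these numbers the claimed per-socket lead is roughly 1.7x, while the per-core figure comes out lower for AUM than for the A64FX, which suggests the lead rests on the higher core count rather than on higher per-core peak throughput.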

At the moment, the most powerful ARM processor on the planet is the 48-core A64FX processor from Fujitsu, which was created as the heavily vectored compute engine for the “Fugaku” supercomputer at RIKEN Lab in Japan. Nvidia is getting ready to ship its 72-core “Grace” Arm CPU, which has yet to be given a product name but CG100 seems logical. And both are going to be getting some intense competition from a newcomer based in India.

There is a non-zero chance that the AUM HPC processor designed by the Centre for Development of Advanced Computing (C-DAC) could outperform both A64FX and Grace and even give Amazon’s 64-core Graviton3 chip and Ampere Computing’s Altra, Altra Max, and AmpereOne processors a run for the money on more generic workloads.

C-DAC said that it is seeking industry collaboration to design systems-on-chip (SoCs) and servers, and to deploy and market solutions based on the AUM processor.

The development of AUM is part of India's National Supercomputing Mission, which was announced in 2015 and aims to move the country from assembly to manufacturing to “design and manufacturing” of supercomputers, including HPC network, software stack, HPC processor and liquid cooling technologies.

C-DAC is entrusted with the task of building a network of 24 supercomputers with a combined compute power of more than 64 petaflops. More than half of these have already been deployed at multiple technical institutes in India, including IITs, the Indian Institute of Science (IISc), and the Indian Institute of Science Education and Research (IISER) Pune.

To create the AUM processor, the techies at C-DAC studied the A64FX processor and Fugaku system at RIKEN as well as its predecessor, the Sparc64-VIIIfx processor and K supercomputer, and saw what we all have seen in the HPCG benchmark data since it was first released: a better ratio of memory bandwidth to floating point operations per second is what actually drives real-world application performance. The K system had a bytes/flops ratio of 0.5 and delivered 5.2 percent of its peak HPL performance on the HPCG test, compared to a 0.38 bytes/flops ratio and 3 percent of peak for Fugaku. (In other words, performance moved in the wrong direction with the generational leap at RIKEN.)

So C-DAC decided to try to push the bytes/flops ratio above 0.5. In addition, C-DAC wanted to stay away from GPUs and other accelerators and use relatively small vectors that are easier to optimize, as well as provide a mix of HBM and DDR main memories and plenty of PCI-Express I/O lanes with CXL coherency support out to accelerators.
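To make the bytes/flops target concrete, here is a minimal sketch of the arithmetic. It assumes the 0.5 target applies per socket against the claimed 4.6 teraflops peak; the article does not state AUM's actual memory bandwidth, so the result is only what the stated target would imply. The function and constant names are illustrative.

```python
def bytes_per_flop(bandwidth_tb_per_s: float, peak_teraflops: float) -> float:
    """Memory bandwidth available per floating point operation."""
    return bandwidth_tb_per_s / peak_teraflops


def bandwidth_for_ratio(peak_teraflops: float, target_ratio: float) -> float:
    """Memory bandwidth (TB/s) needed to reach a target bytes/flops ratio."""
    return peak_teraflops * target_ratio


# Figures quoted in the article.
AUM_PEAK_TERAFLOPS = 4.6  # claimed peak per socket
TARGET_RATIO = 0.5        # the floor C-DAC wants to exceed (the K system's ratio)
FUGAKU_RATIO = 0.38       # Fugaku's ratio, for comparison

# Assumption: the ratio is evaluated per socket against the claimed peak.
needed = bandwidth_for_ratio(AUM_PEAK_TERAFLOPS, TARGET_RATIO)
at_fugaku_ratio = bandwidth_for_ratio(AUM_PEAK_TERAFLOPS, FUGAKU_RATIO)

print(f"Bandwidth for a {TARGET_RATIO} ratio: {needed:.2f} TB/s")
print(f"Bandwidth at Fugaku's {FUGAKU_RATIO} ratio: {at_fugaku_ratio:.2f} TB/s")
```

Under that assumption, a 4.6 teraflops socket would need more than about 2.3 TB/s of memory bandwidth to clear the 0.5 mark.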

C-DAC has also developed a server named Rudra and a high-speed interconnect called Trinetra, which connects the supercomputers.

Mint reported last year that the supercomputers being deployed in India are not among the most powerful in the world. According to the Top 500 data, only five Indian supercomputers make the list, and none of them is in the top 100. In comparison, China and the US account for nearly two-thirds of the supercomputers in the Top 500.

The computing power of a supercomputer is measured in floating-point operations per second or FLOPS. One petaflops is equal to 1,000,000,000,000,000 (one quadrillion) FLOPS, or one thousand teraflops.
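The unit relationships can be sketched in a few lines of Python. The prefixes are standard SI (tera is 10^12, peta is 10^15, exa is 10^18); the function names are illustrative, and the 64 petaflops input is the combined target for the C-DAC network quoted in this article.

```python
# SI prefixes as exact integers, so the conversion factors carry no rounding.
FLOPS_PER_TERAFLOPS = 10**12
FLOPS_PER_PETAFLOPS = 10**15
FLOPS_PER_EXAFLOPS = 10**18


def petaflops_to_teraflops(petaflops: float) -> float:
    """One petaflops is one thousand teraflops."""
    return petaflops * (FLOPS_PER_PETAFLOPS // FLOPS_PER_TERAFLOPS)


def petaflops_to_exaflops(petaflops: float) -> float:
    """One exaflops is one thousand petaflops."""
    return petaflops / (FLOPS_PER_EXAFLOPS // FLOPS_PER_PETAFLOPS)


print(petaflops_to_teraflops(1))    # teraflops in one petaflops
print(petaflops_to_exaflops(1000))  # the exascale threshold, in exaflops
print(petaflops_to_exaflops(64))    # the 64-petaflops network target, in exaflops
```

An exascale system is one that reaches at least one exaflops, or 1,000 petaflops.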

The world’s fastest supercomputer, Frontier, which is located in the US, offers a peak speed of 1,685 petaflops, while Japan’s Fugaku has a speed of 537 petaflops. Param Siddhi, which offers a peak performance of 5.27 petaflops, is currently the fastest supercomputer in India.

That said, developing an exascale supercomputer driven by an indigenous processor will be a major achievement for India and will also boost research and development efforts in the country.