The Google cloud is growing with a number of new data centers. Google is not doing things by halves and speaks of up to 26 exaflops of AI performance per new system, which corresponds to 26 quintillion (26,000,000,000,000,000,000) operations per second. Customers can rent the computing power via upcoming A3 instances, for example to train large language models.

In a blog post, Google talks about A3 GPU supercomputers that the company is building all over the world. Each system uses the same hardware components, scaled to different quantities.

Nvidia’s H100 GPUs and Intel’s fourth-generation Xeon Scalable processors, also known as Sapphire Rapids, are used. The systems are apparently based on Nvidia’s DGX H100, so a node should consist of eight H100 accelerators and two Xeon SP CPUs. Nvidia’s NVLink interconnect and the associated NVSwitch chips handle communication between the GPUs, with Google using its own software stack. Custom network processors (Infrastructure Processing Units, IPUs) developed together with Intel offload work from the Xeon CPUs.

A Google spokeswoman confirmed to HPCwire that tens of thousands of H100 GPUs will be deployed in the largest A3 data centers: “For our largest customers, we can build A3 supercomputers with up to 26,000 GPUs in a single cluster and are working to build multiple clusters in our largest regions.” However, not every system gets that many GPUs.

At this scale, Google can compete with the fastest supercomputers currently in the world. Frontier, the leader of the current Top500 list, manages more than one exaflop of FP64 performance with thousands of AMD Epyc processors and AMD Instinct MI250X GPUs.

In this data format (FP64), 26,000 H100 GPUs would achieve about 780 petaflops (0.78 exaflops) at best; the real performance is likely to be lower across such a large network. Added to this would be the computing power of 6,500 Intel CPUs (with two processors per eight-GPU node). The 26 exaflops cited above apply to simpler AI formats such as TensorFloat-32 (TF32) or FP16.
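The arithmetic behind these figures can be retraced in a few lines of Python. This is only a rough sketch: the 30 teraflops of FP64 per GPU is not stated in the article but is the value implied by its own numbers (780 petaflops divided by 26,000 GPUs), and all variable names are chosen here purely for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the article.
GPUS = 26_000                    # largest A3 cluster, per the Google quote
GPUS_PER_NODE = 8                # eight H100 accelerators per node
CPUS_PER_NODE = 2                # two Xeon SP CPUs per node

# Assumption: per-GPU FP64 peak implied by the article (780 PF / 26,000 GPUs).
FP64_FLOPS_PER_GPU = 30 * 10**12
# Google's headline figure for AI formats across the whole system.
AI_FLOPS_TOTAL = 26 * 10**18

nodes = GPUS // GPUS_PER_NODE
cpus = nodes * CPUS_PER_NODE

# Theoretical peaks only; network and scaling losses are ignored.
fp64_total_pflops = GPUS * FP64_FLOPS_PER_GPU / 10**15
ai_per_gpu_pflops = AI_FLOPS_TOTAL / GPUS / 10**15

print(f"nodes: {nodes}, CPUs: {cpus}")
print(f"FP64 peak: {fp64_total_pflops:.0f} petaflops")
print(f"AI peak per GPU: {ai_per_gpu_pflops:.1f} petaflops")
```

The gap between the two results is the point of the comparison: the headline 26 exaflops works out to roughly one petaflop per GPU in the AI formats, while the FP64 peak relevant for the Top500 is more than an order of magnitude lower.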

As things stand, a fully equipped A3 supercomputer would comfortably occupy second place in the Top500 list. As a private company, however, Google will probably not carry out a Linpack benchmark run in order to appear on the list.


(mma)
