The battle of the SuperPods: Nvidia challenges Huawei with a Vera Rubin-powered DGX cluster that can deliver 28.8 Exaflops with only 576 GPUs
Nvidia’s Rubin DGX SuperPOD delivers 28.8 Exaflops using 576 GPUs, combining compute, memory, and software to rival Huawei’s SuperPoD.

(Image credit: Nvidia)
- Nvidia Rubin DGX SuperPOD delivers 28.8 Exaflops with only 576 GPUs
- Each NVL72 system combines 36 Vera CPUs, 72 Rubin GPUs, and 18 DPUs
- Aggregate NVLink throughput reaches 260TB/s per DGX rack
At CES 2026, Nvidia unveiled its next-generation DGX SuperPOD powered by the Rubin platform, a system designed to deliver extreme AI compute in dense, integrated racks.
According to the company, the SuperPOD integrates multiple Vera Rubin NVL72 or NVL8 systems into a single coherent AI engine, supporting large-scale workloads with minimal infrastructure complexity.
With liquid-cooled modules, high-speed interconnects, and unified memory, the system targets institutions seeking maximum AI throughput and reduced latency.
Rubin-based compute architecture
Each DGX Vera Rubin NVL72 system includes 36 Vera CPUs, 72 Rubin GPUs, and 18 BlueField-4 DPUs, with each Rubin GPU rated at 50 petaflops of FP4 compute, or roughly 3.6 Exaflops per rack; eight such racks make up the SuperPOD's headline 28.8 Exaflops across 576 GPUs.
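The headline figures follow directly from the per-GPU rating; a quick back-of-the-envelope check (assuming the 50 petaflop FP4 figure applies to each Rubin GPU package):

```python
# Back-of-the-envelope check of Nvidia's published SuperPOD figures.
# Assumption: the 50 PF FP4 rating is per Rubin GPU package.
FP4_PER_GPU_PF = 50          # petaflops per Rubin GPU (FP4)
GPUS_PER_NVL72 = 72          # Rubin GPUs per NVL72 rack
RACKS_PER_SUPERPOD = 8       # 8 x 72 = 576 GPUs total

rack_exaflops = FP4_PER_GPU_PF * GPUS_PER_NVL72 / 1000
pod_exaflops = rack_exaflops * RACKS_PER_SUPERPOD
total_gpus = GPUS_PER_NVL72 * RACKS_PER_SUPERPOD

print(f"{total_gpus} GPUs -> {rack_exaflops:.1f} EF/rack, {pod_exaflops:.1f} EF total")
# 576 GPUs -> 3.6 EF/rack, 28.8 EF total
```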
Aggregate NVLink throughput reaches 260TB/s per rack, allowing the full memory and compute space to operate as a single coherent AI engine.
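Dividing that aggregate figure evenly across the rack gives a sense of per-GPU bandwidth (a rough split; Nvidia has not published the exact per-link topology here):

```python
# Rough per-GPU share of the quoted aggregate NVLink bandwidth.
# Assumption: the 260 TB/s figure is split evenly across the rack's 72 GPUs.
AGGREGATE_NVLINK_TBS = 260
GPUS_PER_RACK = 72

per_gpu_tbs = AGGREGATE_NVLINK_TBS / GPUS_PER_RACK
print(f"~{per_gpu_tbs:.1f} TB/s of NVLink bandwidth per Rubin GPU")
# ~3.6 TB/s per GPU, double the 1.8 TB/s per-GPU NVLink figure of Blackwell
```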
The Rubin GPU incorporates a third-generation Transformer Engine and hardware-accelerated compression, allowing inference and training workloads to run efficiently at scale.
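Transformer Engine is exposed to developers through Nvidia's open source transformer_engine library; a minimal PyTorch sketch of low-precision autocasting looks like the following (shown with the FP8 recipe the current library ships; an FP4 path on Rubin-class hardware would presumably follow the same pattern):

```python
# Minimal Transformer Engine sketch: run a layer under a low-precision recipe.
# Uses the FP8 API shipping in today's transformer_engine; FP4 on Rubin is
# assumed here to be exposed through a similar recipe mechanism.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common.recipe import DelayedScaling, Format

layer = te.Linear(4096, 4096, bias=True).cuda()
recipe = DelayedScaling(fp8_format=Format.HYBRID,   # E4M3 fwd, E5M2 bwd
                        amax_history_len=16,
                        amax_compute_algo="max")

x = torch.randn(16, 4096, device="cuda", dtype=torch.bfloat16)
with te.fp8_autocast(enabled=True, fp8_recipe=recipe):
    y = layer(x)            # matmul executes in FP8 on supported GPUs
```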
Connectivity is reinforced by Spectrum-6 Ethernet switches, Quantum-X800 InfiniBand, and ConnectX-9 SuperNICs, which support deterministic high-speed AI data transfer.
Nvidia’s SuperPOD design emphasizes end-to-end networking performance, ensuring minimal congestion in large AI clusters.
Quantum-X800 InfiniBand delivers low latency and high throughput, while Spectrum-X Ethernet handles east-west AI traffic efficiently.
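East-west traffic in AI clusters is dominated by collective operations between GPUs; a minimal PyTorch distributed sketch of the kind of all-reduce these fabrics carry:

```python
# The east-west traffic these fabrics carry is dominated by collectives.
# Minimal sketch: an all-reduce of gradients across ranks with NCCL, the
# traffic pattern InfiniBand and Spectrum-X Ethernet fabrics are tuned for.
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")  # one process per GPU, via torchrun
rank = dist.get_rank()
torch.cuda.set_device(rank % torch.cuda.device_count())

grads = torch.randn(4096, 4096, device="cuda")  # stand-in for a gradient shard
dist.all_reduce(grads, op=dist.ReduceOp.SUM)    # crosses the cluster fabric
grads /= dist.get_world_size()                  # average across all GPUs
```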
Each DGX rack incorporates 600TB of fast memory, NVMe storage, and integrated AI context memory to support both training and inference pipelines.
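To put 600TB in context, a rough sizing sketch (assuming 4-bit FP4 weights and ignoring activations, KV cache, and framework overhead):

```python
# Rough illustration of what 600 TB of rack-level fast memory can hold.
# Assumption: 4-bit (FP4) weights only; activations, KV cache, and
# framework overhead are ignored for this back-of-the-envelope figure.
FAST_MEMORY_TB = 600
BYTES_PER_FP4_PARAM = 0.5    # 4 bits = half a byte per parameter

max_params_trillions = FAST_MEMORY_TB / BYTES_PER_FP4_PARAM
print(f"~{max_params_trillions:.0f}T FP4 parameters fit in {FAST_MEMORY_TB} TB")
# ~1200T parameters, far beyond today's frontier models, which leaves
# headroom for the long-context "AI context memory" Nvidia describes
```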
The Rubin platform also integrates advanced software orchestration through Nvidia Mission Control, streamlining cluster operations, automating recovery, and simplifying infrastructure management for large AI factories.