
The Challenge of Scaling Enterprise AI
Every business needs to transform using AI, not only to survive, but to thrive in challenging times. But getting there is difficult: most organizations struggle to scale the infrastructure they need to tap into the power of AI. Traditional approaches to AI infrastructure rely on slow compute architectures that are siloed by analytics, training, and inference workloads, which creates complexity, drives up cost, and limits how quickly capacity can grow. The enterprise data center is not ready for today’s AI. Enterprises need a new platform that unifies all AI workloads, simplifying infrastructure and accelerating ROI.
The Universal System for Every AI Workload
NVIDIA DGX™ A100 is the universal system for all AI workloads, from analytics to training to inference. DGX A100 sets a new bar for compute density, packing 5 petaFLOPS of AI performance into a 6U form factor and replacing legacy compute infrastructure with a single, unified system. DGX A100 also offers fine-grained allocation of computing power through the Multi-Instance GPU capability in the NVIDIA A100 Tensor Core GPU, which enables administrators to assign resources that are right-sized for specific workloads. This ensures that the largest and most complex jobs are supported, along with the simplest and smallest. Running the DGX software stack with optimized software from NGC, DGX A100 combines dense compute power with complete workload flexibility, making it an ideal choice for both single-node deployments and large-scale Slurm and Kubernetes clusters deployed with NVIDIA DeepOps.
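The right-sizing idea behind Multi-Instance GPU can be illustrated with a small planning sketch. It assumes each A100 exposes up to seven compute slices and that instances come in sizes of 1, 2, 3, 4, or 7 slices. The `plan` helper and the job names are hypothetical, and the sketch checks only slice totals: it does not model the real placement rules or memory partitioning, and it does not call any NVIDIA tool.

```python
from dataclasses import dataclass, field

SLICES_PER_GPU = 7             # assumption: up to seven MIG compute slices per A100
GPUS_PER_SYSTEM = 8            # from the specification table
VALID_SIZES = (1, 2, 3, 4, 7)  # assumption: slice counts of the A100 MIG profiles


@dataclass
class Gpu:
    index: int
    free: int = SLICES_PER_GPU
    jobs: list = field(default_factory=list)


def plan(requests):
    """Assign (job_name, slices) requests to GPUs, largest request first.

    Hypothetical helper: a first-fit packer over slice counts only.
    """
    gpus = [Gpu(i) for i in range(GPUS_PER_SYSTEM)]
    for name, size in sorted(requests, key=lambda r: -r[1]):
        if size not in VALID_SIZES:
            raise ValueError(f"{name}: {size} is not a supported instance size")
        # First GPU with enough free slices, or None if every GPU is too full.
        target = next((g for g in gpus if g.free >= size), None)
        if target is None:
            raise RuntimeError(f"{name}: no GPU has {size} free slices")
        target.free -= size
        target.jobs.append((name, size))
    return gpus


if __name__ == "__main__":
    demo = [("training", 7), ("analytics", 3),
            ("inference-a", 1), ("inference-b", 1)]
    for gpu in plan(demo):
        if gpu.jobs:
            print(f"GPU {gpu.index}: {gpu.jobs}, free slices: {gpu.free}")
```

In this example the full-GPU training job takes one GPU by itself, while the analytics job and the two small inference jobs share a second GPU, leaving six GPUs untouched.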
System Specifications
GPUs | 8x NVIDIA A100 Tensor Core GPUs |
GPU Memory | 320 GB total |
Performance | 5 petaFLOPS AI, 10 petaOPS INT8 |
NVIDIA NVSwitches | 6 |
System Power Usage | 6.5kW max |
CPU | Dual AMD Rome 7742, 128 cores total, 2.25 GHz (base), 3.4 GHz (max boost) |
System Memory | 1TB |
Networking | 8x Single-Port Mellanox ConnectX-6 VPI, 200Gb/s HDR InfiniBand; 1x Dual-Port Mellanox ConnectX-6 VPI, 10/25/50/100/200Gb/s Ethernet |
Storage | OS: 2x 1.92TB M.2 NVMe drives; Internal Storage: 15TB (4x 3.84TB) U.2 NVMe drives |
Software | Ubuntu Linux OS |
System Weight | 271 lbs (123 kg) |
Packaged System Weight | 315 lbs (143 kg) |
System Dimensions | Height: 10.4 in (264.0 mm); Width: 19.0 in (482.3 mm) max; Length: 35.3 in (897.1 mm) max |
Operating Temperature Range | 5°C to 30°C (41°F to 86°F) |
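The system-level figures in the table can be broken down per component with simple arithmetic. The sketch below uses only numbers from the table. The `derive` function and its key names are illustrative, and the results are plain divisions of the quoted peak figures, not measured or separately published values.

```python
# Figures copied from the specification table above.
GPUS = 8
GPU_MEMORY_GB = 320      # total across all GPUs
AI_PETAFLOPS = 5         # peak AI performance, whole system
INT8_PETAOPS = 10        # peak INT8 performance, whole system
CPU_CORES = 128          # total across both CPUs
CPU_SOCKETS = 2
DATA_DRIVES = 4
DATA_DRIVE_TB = 3.84
MAX_POWER_KW = 6.5


def derive():
    """Return per-component figures derived from the system totals."""
    return {
        "memory_per_gpu_gb": GPU_MEMORY_GB / GPUS,
        "ai_teraflops_per_gpu": AI_PETAFLOPS * 1000 / GPUS,
        "int8_teraops_per_gpu": INT8_PETAOPS * 1000 / GPUS,
        "cores_per_cpu": CPU_CORES // CPU_SOCKETS,
        "raw_data_storage_tb": DATA_DRIVES * DATA_DRIVE_TB,
        "ai_teraflops_per_kw": AI_PETAFLOPS * 1000 / MAX_POWER_KW,
    }


if __name__ == "__main__":
    for name, value in derive().items():
        print(f"{name}: {value:.2f}")
```

The raw capacity of the four U.2 drives works out to 15.36 TB, which the table rounds to 15TB.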
Direct Access to NVIDIA DGXperts
NVIDIA DGX A100 is more than a server. It’s a complete hardware and software platform built upon the knowledge gained from the world’s largest DGX proving ground, NVIDIA DGX SATURNV, and backed by thousands of DGXperts at NVIDIA. DGXperts are AI-fluent practitioners who offer prescriptive guidance and design expertise to help fast-track AI transformation. They’ve built a wealth of know-how and experience over the last decade to help maximize the value of your DGX investment. DGXperts help ensure that critical applications get up and running quickly and stay running smoothly, for dramatically improved time to insights.