NVIDIA has just unveiled the full specifications of its biggest SOC design to date, the Xavier SOC. Built for AI supercomputing, the Xavier SOC is an engineering marvel that packs 9 billion transistors, making it one of the most densely packed SOC designs to date.
NVIDIA's Xavier SOC Is The Biggest and Most Complex SOC Design To Date - Features Volta GPU, Custom CPU and 9 Billion Transistors
Back at GTC Europe 2016, NVIDIA first introduced the Xavier SOC, which was still in development. Fast forward to CES 2018, and NVIDIA is taking the wraps off the complete SOC specifications. It is a very different chip from the one NVIDIA announced back in 2016.
The design has changed a lot in that time. Today, NVIDIA mentioned that the Xavier SOC is built on TSMC's 12nm process node and houses 9 billion transistors within a 350 mm² die. This is a big change from the 16nmFF-based, 7 billion transistor design we were shown last time. My belief is that most of these changes came about when NVIDIA announced and launched Volta on the 12nm process earlier in 2017. Moving the same design enhancements to an SOC brought big changes, as Volta is a key part of the Xavier chip.
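To put the die figures in perspective, here is a quick back-of-the-envelope sketch of the average transistor density and the growth over the 2016 design. The numbers are the ones quoted above; the helper name `density_mtr_per_mm2` is purely illustrative, and the result is a die-wide average rather than a foundry density figure.

```python
# Figures quoted in the article; the helper name is illustrative.
TRANSISTORS_2018 = 9e9   # final Xavier design
TRANSISTORS_2016 = 7e9   # design shown at GTC Europe 2016
DIE_AREA_MM2 = 350       # final Xavier die area


def density_mtr_per_mm2(transistors: float, die_area_mm2: float) -> float:
    """Return the average density in millions of transistors per mm^2."""
    return transistors / die_area_mm2 / 1e6


xavier_density = density_mtr_per_mm2(TRANSISTORS_2018, DIE_AREA_MM2)
transistor_growth = TRANSISTORS_2018 / TRANSISTORS_2016 - 1

print(f"Average density: {xavier_density:.1f} MTr/mm^2")
print(f"Transistor count growth since 2016: {transistor_growth:.1%}")
```

That works out to roughly 25.7 million transistors per mm² on average, with the transistor budget growing by a little under 29% compared to the chip announced in 2016.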
In terms of specifications, the Xavier SOC packs NVIDIA's custom-built Carmel ARM64 CPU, which houses 8 cores in a 10-wide superscalar architecture. The CPU itself offers features such as functional safety, dual execution, parity and ECC. Also on the die is a Volta GPU which packs 512 CUDA cores and can perform FP32, FP16 and INT8 calculations as needed in a multi-precision environment. The chip delivers 1.3 TFLOPs of peak FP32 performance and 20 Tensor core TOPs, all at just 20W. NVIDIA states that it can even reach a theoretical output of 30 TOPs with a 30W TDP, so they are presumably referring to higher clocks enabled by the increased power envelope.
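The quoted numbers allow a rough sanity check. The sketch below estimates the GPU clock implied by the 1.3 TFLOPs FP32 figure and the TOPs-per-watt at both power levels. It assumes each CUDA core retires one fused multiply-add (2 FLOPs) per clock; NVIDIA has not stated the clock here, so treat the result as an estimate rather than a specification.

```python
# Figures quoted in the article.
CUDA_CORES = 512
PEAK_FP32_FLOPS = 1.3e12

# Assumption: one fused multiply-add (counted as 2 FLOPs) per core per clock.
FLOPS_PER_CORE_PER_CLOCK = 2

implied_clock_ghz = PEAK_FP32_FLOPS / (CUDA_CORES * FLOPS_PER_CORE_PER_CLOCK) / 1e9

# (TOPs, watts) for the two operating points mentioned by NVIDIA.
operating_points = {"20W": (20, 20), "30W": (30, 30)}
tops_per_watt = {
    name: tops / watts for name, (tops, watts) in operating_points.items()
}

print(f"Implied GPU clock: ~{implied_clock_ghz:.2f} GHz")
for name, value in tops_per_watt.items():
    print(f"{name}: {value:.1f} TOPs/W")
```

Both operating points land at 1 TOPs per watt, so the 30W mode scales throughput linearly with power on paper, and the FP32 figure implies a GPU clock in the neighborhood of 1.27 GHz under the stated assumption.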
With more than 9 billion transistors, Xavier is the most complex system on a chip ever created, representing the work of more than 2,000 NVIDIA engineers over a four-year period, and an investment of $2 billion in research and development.
It’s built around a custom 8-core CPU, a new 512-core Volta GPU, a new deep learning accelerator, new computer vision accelerators and new 8K HDR video processors. And with our unified architecture, all previous NVIDIA DRIVE software development carries over and runs.
While the technical details are complex, the story is simple: DRIVE Xavier puts more processing power to work using less energy, delivering 30 trillion operations per second while consuming just 30 watts. It’s 15 times more energy efficient than our previous generation architecture. via NVIDIA
Based on the tensor core compute output, it seems like NVIDIA has the same Tensor cores on board the SOC as their flagship Tesla V100 Volta GPU which blazes through deep learning algorithms.
NVIDIA Drive PX Generation Comparison:
Product Name | NVIDIA Drive PX | NVIDIA Drive PX 2 | NVIDIA Drive Xavier | NVIDIA Drive Pegasus | NVIDIA Drive AGX Orin |
---|---|---|---|---|---|
SOC Name | Tegra X1 | Parker | Xavier | Xavier | Orin |
Process Technology | 20nm SOC | 16nm FinFET | 12nm FinFET | 12nm FinFET | TBA |
SOC Transistors | 2 Billion (Tegra X1) | N/A | 9 Billion (Xavier) | 9 Billion (Xavier) | 17 Billion (Orin) |
GPU Architecture | Maxwell (256 Core) | Pascal (256 Core) | Volta (512 Core) | Volta (512 Core) | Ampere? |
CPU | 16 Core ARM CPU | 12 Core ARM CPU | 8 Core ARM CPU | 16 Core ARM CPU | 12 Core ARM CPU |
CPU Architecture | 8x Cortex A57 8x Cortex A53 | 4x Denver 8x Cortex A57 | Carmel ARM64 8 Core CPU (8 MB L2 + 4 MB L3) | Carmel ARM64 8 Core CPU (8 MB L2 + 4 MB L3) | ARM Hercules Cores |
Compute DLTOPs | N/A | 20 DLTOPs | 30 TOPs | 320 TOPs | 200 TOPs |
Total Chips | 2 x Tegra X1 | 2 x Tegra X2 2 x Pascal MXM GPUs | 1 x Xavier | 2 x Volta 2 x Turing | 1 x Ampere |
System Memory | LPDDR4 | 8 GB LPDDR4 (50+ GB/s) | 16 GB 256-bit LPDDR4 | LPDDR4 + GDDR6 | N/A |
Graphics Memory | N/A | 4 GB GDDR5 (80+ GB/s) | 137 GB/s | 1 TB/s | 200 GB/s |
TDP | 20W | 80W | 30W | 500W | TBA |
The ISP on board the chip allows for 1.5 GPIX/s, native full range HDR and tile based rendering. The DLA engine can perform 5 TFLOPs FP16 and 10 TOPs INT8, while the PVA engine allows for 1.6 TOPs and handles stereo disparity, optical flow and image processing. Sensor and network connectivity is handled by 16 CSI lanes that can drive 109 Gbps, along with support for 1 GbE and 10 GbE. Finally, the video processor on board the chip is able to do 1.2 GPIX/s encode and 1.8 GPIX/s decode.
The NVIDIA Drive Xavier will be part of NVIDIA's Drive AI family and is currently in the sampling phase. NVIDIA has announced that mass production is expected in late 2018, and the Xavier SOC will eventually become part of the bigger Drive Pegasus platform, which includes two Volta GPUs and two next generation discrete GPUs on a single platform for self-driving cars.
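For a sense of what the video processor throughput means in practice, the sketch below converts the quoted pixel rates into an equivalent number of video streams. The 4K UHD at 60 fps stream format is an assumption picked for illustration; real-world capacity also depends on codec, bit depth and other factors not covered here.

```python
# Throughput figures quoted in the article (pixels per second).
ENCODE_PIX_PER_S = 1.2e9
DECODE_PIX_PER_S = 1.8e9


def pixel_rate(width: int, height: int, fps: int) -> int:
    """Return the raw pixel rate of a video stream in pixels per second."""
    return width * height * fps


# Assumed stream format for illustration: 4K UHD (3840x2160) at 60 fps.
UHD_60 = pixel_rate(3840, 2160, 60)

encode_streams = ENCODE_PIX_PER_S / UHD_60
decode_streams = DECODE_PIX_PER_S / UHD_60

print(f"4K60 pixel rate: {UHD_60 / 1e6:.1f} MPIX/s")
print(f"Equivalent 4K60 encode streams: {encode_streams:.2f}")
print(f"Equivalent 4K60 decode streams: {decode_streams:.2f}")
```

Under that assumption, the quoted rates correspond to a little over two simultaneous 4K60 encodes and between three and four 4K60 decodes.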
NVIDIA Drive PX 'Pegasus' with Next Generation dGPUs Specification
Wccftech | DRIVE PX 'Pegasus' | DRIVE PX 2 |
---|---|---|
SoC | 2x Xavier | 2x Tegra X2 |
Discrete GPU | 2x Next Generation Unknown | 2x Pascal |
CPU Cores | 16x NVIDIA Unknown ARM | 4x NVIDIA Denver & 8x ARM Cortex A57 |
GPU Cores | 2x Volta iGPU & 2x Post Volta dGPUs | 2x Pascal iGPU & 2x GP104 |
DL TOPS | 320 TOPS | 24 TOPS |
TFLOPS | N/A | 8 TFLOPs |
TDP | 500W | 250W |
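Using only the DL TOPS and TDP rows from the table above, a rough efficiency comparison between the two platforms can be sketched as follows. This is simple division on the listed figures, not a measured benchmark, and the dictionary keys are just labels.

```python
# (DL TOPS, TDP in watts) taken from the table above.
platforms = {
    "DRIVE PX 'Pegasus'": (320, 500),
    "DRIVE PX 2": (24, 250),
}

# Deep learning throughput per watt for each platform.
efficiency = {name: tops / watts for name, (tops, watts) in platforms.items()}

# How many times more efficient Pegasus is than PX 2 on paper.
gain = efficiency["DRIVE PX 'Pegasus'"] / efficiency["DRIVE PX 2"]

for name, value in efficiency.items():
    print(f"{name}: {value:.3f} TOPS/W")
print(f"Pegasus vs PX 2 efficiency gain: {gain:.1f}x")
```

On these numbers, Pegasus delivers about 0.64 TOPS per watt against roughly 0.1 for Drive PX 2, a gain of between six and seven times despite the doubled TDP.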
DRIVE PX Pegasus is powered by four high-performance AI processors. It couples two of Xavier system-on-a-chip processors — featuring an embedded GPU based on the Volta architecture — with two next-generation discrete GPUs with hardware created for accelerating deep learning and computer vision algorithms. The system will provide the enormous computational capability for fully autonomous vehicles in a computer the size of a license plate, drastically reducing energy consumption and cost.
Pegasus is designed for ASIL D certification — the industry’s highest safety level — with automotive inputs/outputs, including CAN (controller area network), Flexray, 16 dedicated high-speed sensor inputs for camera, radar, lidar and ultrasonics, plus multiple 10Gbit Ethernet connectors. Its combined memory bandwidth exceeds 1 terabyte per second. via NVIDIA
NVIDIA has said that it will be cooperating with Volkswagen on developing self-driving cars such as the VW Buzz.
Xavier arrives just as players across the global auto industry are preparing to launch a wave of next-gen vehicles with unprecedented capabilities.
“We believe that AI can eventually drive the cost per mile of autonomous vehicles to essentially the same level, if not below, that of owned cars — self-driven cars,” Huang said. “So when that happens, it’s possible, we believe, that AV could revolutionize mobility services.”
Emphasizing the importance of AI to the auto industry, Volkswagen CEO Herbert Diess joined Jensen on stage to discuss how AI and deep learning will shape the development of a new generation of VW vehicles. via NVIDIA
Technologies such as Drive IX and Drive AR will also be an integral part of the NVIDIA Drive platform, which aims to revolutionize the automobile industry.