Tesla V100
April 2018

 


NVIDIA® Tesla® V100 is the most advanced data center GPU ever built, designed to accelerate AI, HPC, and graphics. Powered by NVIDIA Volta, the latest GPU architecture, Tesla V100 offers the performance of up to 100 CPUs in a single GPU, enabling data scientists, researchers, and engineers to tackle challenges that were once thought impossible.

Tesla V100 is the flagship product of the Tesla data center computing platform for deep learning, HPC, and graphics. The Tesla platform accelerates over 550 HPC applications and every major deep learning framework. It is available everywhere, from desktops to servers to cloud services, delivering both dramatic performance gains and cost-savings opportunities.


 

TESLA V100 Accelerator Features and Benefits

VOLTA ARCHITECTURE
By pairing CUDA Cores and Tensor Cores within a unified architecture, a single server with Tesla V100 GPUs can replace hundreds of commodity CPU servers for traditional HPC and Deep Learning.
MAXIMUM EFFICIENCY MODE
The new maximum efficiency mode allows data centers to achieve up to 40% higher compute capacity per rack within the existing power budget. In this mode, Tesla V100 runs at peak processing efficiency, providing up to 80% of the performance at half the power consumption.
TENSOR CORE
Equipped with 640 Tensor Cores, Tesla V100 delivers 125 teraFLOPS of deep learning performance. That's 12X the Tensor FLOPS for DL training and 6X the Tensor FLOPS for DL inference compared to NVIDIA Pascal™ GPUs.
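As a rough cross-check, the 125 teraFLOPS figure can be reproduced from the Tensor Core count. A minimal sketch, assuming NVIDIA's published per-core throughput (128 FLOPs per clock) and approximate boost clocks (about 1530 MHz for the SXM2 part and 1370 MHz for the PCIe part), none of which are stated on this page:

```python
# Sketch: deriving Tesla V100's peak Tensor Core throughput from the
# core count in this datasheet. The FLOPs-per-clock figure and the
# boost clocks are assumptions based on NVIDIA's published numbers.

TENSOR_CORES = 640               # from the datasheet
FLOPS_PER_CORE_PER_CLOCK = 128   # 4x4x4 matrix FMA = 64 MACs = 128 FLOPs

def peak_tensor_tflops(boost_clock_ghz):
    """Peak mixed-precision Tensor Core throughput in teraFLOPS."""
    return TENSOR_CORES * FLOPS_PER_CORE_PER_CLOCK * boost_clock_ghz / 1000.0

print(round(peak_tensor_tflops(1.53)))  # 125 (SXM2 boost clock)
print(round(peak_tensor_tflops(1.37)))  # 112 (PCIe boost clock)
```

The lower PCIe clock reproduces the 112 teraFLOPS Tensor Performance entry in the specification table, which describes the PCIe card.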
HBM2
With a combination of improved raw bandwidth of 900 GB/s and higher DRAM utilization efficiency at 95%, Tesla V100 delivers 1.5X higher memory bandwidth over Pascal GPUs as measured on STREAM. Tesla V100 is now available in a 32GB configuration that doubles the memory of the standard 16GB offering.
NEXT GENERATION NVLINK
NVIDIA NVLink in Tesla V100 delivers 2X higher throughput compared to the previous generation. Up to eight Tesla V100 accelerators can be interconnected at up to 300 GB/s to unleash the highest application performance possible on a single server.
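The 300 GB/s figure is the aggregate bidirectional bandwidth across all of a GPU's NVLink connections. A minimal sketch of the arithmetic, assuming NVIDIA's published NVLink 2.0 configuration for Tesla V100 (six links at 25 GB/s per direction), which this page does not itself state:

```python
# Sketch: where the 300 GB/s NVLink figure comes from. The link count
# and per-link rate are NVIDIA's published NVLink 2.0 numbers for
# Tesla V100, assumed here rather than taken from this page.

LINKS = 6                  # NVLink 2.0 links per Tesla V100
GBPS_PER_DIRECTION = 25.0  # per link, in each direction

total_gbps = LINKS * GBPS_PER_DIRECTION * 2  # sum over both directions
print(total_gbps)  # 300.0 GB/s aggregate bidirectional bandwidth
```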
PROGRAMMABILITY
Tesla V100 is architected from the ground up to simplify programmability. Its new independent thread scheduling enables finer-grain synchronization and improves GPU utilization by sharing resources among small jobs.

Specifications:

Specification                   Description
GPU Architecture                NVIDIA Volta
Tensor Cores                    640
CUDA Cores                      5,120
Single-Precision Performance    14 teraFLOPS
Double-Precision Performance    7 teraFLOPS
Tensor Performance              112 teraFLOPS
GPU Memory                      16GB/32GB HBM2
Memory Bus Width                4,096-bit
Memory Bandwidth                900 GB/s
Memory Clock                    877 MHz
System Interface                PCI Express Gen3, full height/length
Max Power                       250 W
Form Factor                     111.15 mm (4.38 inches) (H) × 266.70 mm (10.5 inches) (L)
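The 900 GB/s bandwidth entry follows directly from the bus width and memory clock listed above. A minimal sanity check, assuming double-data-rate transfers (standard for HBM2):

```python
# Sketch: cross-checking the memory-bandwidth spec against the bus
# width and memory clock in the table. HBM2 transfers data on both
# clock edges (double data rate), hence the factor of 2.

BUS_WIDTH_BITS = 4096   # from the specification table
MEM_CLOCK_HZ = 877e6    # 877 MHz, from the specification table

bytes_per_sec = BUS_WIDTH_BITS / 8 * MEM_CLOCK_HZ * 2
print(bytes_per_sec / 1e9)  # ~898 GB/s, matching the quoted 900 GB/s
```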