April 2018
In the new era of AI and intelligent machines, deep learning is shaping our world like no
other computing model in history. GPUs powered by the revolutionary NVIDIA Pascal™
architecture provide the computational engine for the new era of artificial intelligence,
enabling amazing user experiences by accelerating deep learning applications at scale.
The NVIDIA Tesla P40 is purpose-built to deliver maximum throughput for deep learning
deployment. Each GPU delivers 47 TOPS (Tera-Operations Per Second) of inference
performance using INT8 operations, and a single server with 8 Tesla P40s delivers the
performance of over 140 CPU servers.
As models increase in accuracy and complexity, CPUs are no longer capable of delivering
an interactive user experience. The Tesla P40 delivers over 30X lower latency than a CPU for
real-time responsiveness in even the most complex models.
TESLA P40 Accelerator Features and Benefits
140X HIGHER THROUGHPUT TO KEEP UP WITH EXPLODING DATA
The Tesla P40 is powered by the new Pascal architecture and delivers over 47 TOPS of deep learning inference performance. A single server with 8 Tesla P40s can replace up to 140 CPU-only servers for deep learning workloads, resulting in substantially higher throughput with lower acquisition cost.

SIMPLIFIED OPERATIONS WITH A SINGLE TRAINING AND INFERENCE PLATFORM
Today, deep learning models are trained on GPU servers but deployed on CPU servers for inference. The Tesla P40 offers a drastically simplified workflow, so organizations can use the same servers to iterate and deploy.

REAL-TIME INFERENCE
The Tesla P40 delivers up to 30X faster inference performance with INT8 operations for real-time responsiveness for even the most complex deep learning models.

FASTER DEPLOYMENT WITH NVIDIA DEEP LEARNING SDK
TensorRT, included with the NVIDIA Deep Learning SDK, and the DeepStream SDK help customers seamlessly leverage inference capabilities such as the new INT8 operations and video transcoding.
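
To illustrate what INT8 inference involves, the Python sketch below quantizes two small vectors to 8-bit integers, accumulates their dot product four elements at a time in integer arithmetic, and rescales the result to a floating-point value. It is a minimal sketch and not NVIDIA's or TensorRT's implementation: the symmetric per-vector scaling, the helper names (`quantize_int8`, `dp4a`, `int8_dot`), and the sample values are all assumptions made for illustration.

```python
def quantize_int8(values, scale):
    """Map floats to signed 8-bit integers using a symmetric scale (assumed scheme)."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dp4a(a4, b4, acc):
    """Emulate a 4-element INT8 dot product added to an integer accumulator."""
    assert len(a4) == len(b4) == 4
    return acc + sum(x * y for x, y in zip(a4, b4))


def int8_dot(a, b):
    """Quantize two float vectors, take their dot product in integers, then rescale."""
    # Hypothetical calibration: scale each vector by its largest magnitude.
    scale_a = max(abs(v) for v in a) / 127
    scale_b = max(abs(v) for v in b) / 127
    qa = quantize_int8(a, scale_a)
    qb = quantize_int8(b, scale_b)
    acc = 0
    for i in range(0, len(qa), 4):  # four 8-bit values per step
        acc = dp4a(qa[i:i + 4], qb[i:i + 4], acc)
    return acc * scale_a * scale_b


# Illustrative values only; vector length must be a multiple of 4.
weights = [0.5, -1.0, 0.25, 0.75, 1.0, -0.5, 0.0, 0.125]
inputs = [1.0, 2.0, -1.0, 0.5, 0.25, 4.0, 3.0, -2.0]

exact = sum(w * x for w, x in zip(weights, inputs))
approx = int8_dot(weights, inputs)
print(f"floating-point result: {exact:.4f}, INT8 result: {approx:.4f}")
```

The INT8 result differs slightly from the floating-point one. That small loss of precision is the trade-off accepted in exchange for higher integer throughput.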
Specifications:
| Specification | Description |
|---|---|
| GPU Architecture | NVIDIA Pascal™ |
| CUDA Cores | 3840 |
| Single-Precision Performance | 12 TeraFLOPS |
| Integer Operations (INT8) | 47 TOPS (Tera-Operations per Second) |
| GPU Memory | 24 GB GDDR5 |
| Memory Bus Width | 384-bit |
| Memory Bandwidth | 346 GB/s |
| Memory Clock | Performance: 3615 MHz; Idle: 405 MHz |
| System Interface | Full Height/Length PCI Express Gen3 |
| Max Power | 250 W |
| Form Factor | 111.15 mm (4.38 inches) (H) × 266.70 mm (10.5 inches) (L) |
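
The figures in the specification table can be cross-checked with simple arithmetic. The Python sketch below uses only values listed in the table, plus one assumption the table does not state: that memory moves two transfers per listed memory clock cycle. Under that assumption the computed bandwidth lands within 1% of the listed 346 GB/s. The constant and function names are illustrative.

```python
# Values taken from the specification table.
FP32_TFLOPS = 12.0
INT8_TOPS = 47.0
BUS_WIDTH_BITS = 384
MEMORY_CLOCK_MHZ = 3615
LISTED_BANDWIDTH_GBS = 346.0
GPUS_PER_SERVER = 8  # the 8-GPU server described in the text

# Assumption (not stated in the table): two transfers per listed clock cycle.
TRANSFERS_PER_CLOCK = 2


def memory_bandwidth_gbs(bus_width_bits, clock_mhz, transfers_per_clock):
    """Peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


bandwidth = memory_bandwidth_gbs(BUS_WIDTH_BITS, MEMORY_CLOCK_MHZ, TRANSFERS_PER_CLOCK)
int8_to_fp32_ratio = INT8_TOPS / FP32_TFLOPS
server_int8_tops = GPUS_PER_SERVER * INT8_TOPS

print(f"Computed memory bandwidth: {bandwidth:.2f} GB/s (listed: {LISTED_BANDWIDTH_GBS} GB/s)")
print(f"INT8 to single-precision throughput ratio: {int8_to_fp32_ratio:.2f}")
print(f"Aggregate INT8 throughput for {GPUS_PER_SERVER} GPUs: {server_int8_tops:.0f} TOPS")
```

The ratio of roughly 4 between INT8 and single-precision throughput is consistent with packing four 8-bit values into each 32-bit operation. The aggregate figure is a peak sum and says nothing about how a real workload scales across GPUs.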