Performance of Distributed TensorFlow: A Multi-Node and Multi-GPU Configuration

This 20-page study explores the performance of distributed TensorFlow in a multi-node, multi-GPU configuration running on an Amazon EC2 cluster.

The technical study includes performance results for two types of metrics:

In addition, the following values, derived from the metrics above, were calculated:

Time normalized to the number of computing nodes and workers
Speed of image processing (samples per second) normalized to the number of computing nodes and workers
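
The normalized values above can be illustrated with a minimal sketch. The summary does not state the exact formula, so this assumes one plausible reading: a raw measurement is divided by the total worker count, taken as nodes multiplied by workers per node. The function names and the sample numbers are hypothetical and are not taken from the study.

```python
def total_workers(nodes: int, workers_per_node: int) -> int:
    """Total worker count, assuming every node runs the same number of workers."""
    if nodes < 1 or workers_per_node < 1:
        raise ValueError("nodes and workers_per_node must be positive")
    return nodes * workers_per_node


def normalize(value: float, nodes: int, workers_per_node: int) -> float:
    """Divide a raw measurement by the total worker count (assumed formula)."""
    return value / total_workers(nodes, workers_per_node)


if __name__ == "__main__":
    # Made-up measurement: 800 samples per second on 2 nodes with 4 workers each.
    per_worker = normalize(800.0, nodes=2, workers_per_node=4)
    print(f"{per_worker:.1f} samples per second per worker")
```

Under this assumption, a per-worker figure that stays roughly constant as nodes are added would suggest near-linear scaling, while a falling figure would point to communication or input overhead.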

The benchmark used the Inception architecture as the neural network model and the Camelyon16 data set for training.