Performance of Distributed TensorFlow: A Multi-Node and Multi-GPU Configuration
This 20-page study explores the performance of distributed TensorFlow in a multi-node, multi-GPU configuration running on an Amazon EC2 cluster.
Why read this?
The technical study includes performance results for two types of metrics:
- Total number of images processed per second
- Average total processing time per batch
In addition, the following values—derived from the metrics above—were measured:
- Processing time normalized by the number of computing nodes and workers
- Image processing speed (samples per second) normalized by the number of computing nodes and workers
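As a rough illustration of how these metrics relate, the sketch below computes them from a list of per-batch timings. It is not taken from the study: the `BenchmarkRun` record, its field names, and the choice to normalize by the total worker count (nodes multiplied by workers per node) are all assumptions made for the example, and the study may define the normalization differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BenchmarkRun:
    """Hypothetical record of one benchmark run (names are illustrative)."""
    batch_size: int              # images processed per batch, cluster-wide (assumed)
    batch_times_s: List[float]   # wall-clock seconds for each processed batch
    num_nodes: int               # number of computing nodes
    workers_per_node: int        # workers (e.g. one per GPU) on each node


def images_per_second(run: BenchmarkRun) -> float:
    """Total images processed divided by total processing time."""
    total_images = run.batch_size * len(run.batch_times_s)
    return total_images / sum(run.batch_times_s)


def average_batch_time(run: BenchmarkRun) -> float:
    """Mean processing time per batch, in seconds."""
    return sum(run.batch_times_s) / len(run.batch_times_s)


def total_workers(run: BenchmarkRun) -> int:
    """Assumed normalization factor: nodes multiplied by workers per node."""
    return run.num_nodes * run.workers_per_node


def normalized(value: float, run: BenchmarkRun) -> float:
    """Divide a metric by the assumed total worker count."""
    return value / total_workers(run)


# Made-up timings, purely to show the arithmetic.
run = BenchmarkRun(batch_size=32, batch_times_s=[0.5, 0.5, 1.0],
                   num_nodes=2, workers_per_node=2)
throughput = images_per_second(run)
mean_time = average_batch_time(run)
print(throughput, normalized(throughput, run))
print(mean_time, normalized(mean_time, run))
```

With perfect scaling, the normalized speed stays constant as nodes and workers are added, so a drop in that value is one way to read the overhead introduced by distribution.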
The benchmark used the Inception architecture as the neural network model and the Camelyon16 data set for training.