Authors: M. Sourouri, T. Gillberg, S. Baden and X. Cai
Title: Effective Multi-GPU Communication Using Multiple CUDA Streams and Threads
Affiliation: Scientific Computing
Project(s): Center for Biomedical Computing (SFF)
Status: Published
Publication Type: Proceedings, refereed
Year of Publication: 2014
Conference Name: 20th International Conference on Parallel and Distributed Systems (ICPADS 2014)
Pagination: 981-986
Publisher: IEEE
Abstract

In the context of multiple GPUs that share the same PCIe bus, we propose a new communication scheme that leads to a more effective overlap of communication and computation. Multiple CUDA streams and OpenMP threads are adopted so that data can simultaneously be sent and received. A representative 3D stencil example is used to demonstrate the effectiveness of our scheme. We compare the performance of our new scheme with an MPI-based state-of-the-art scheme. Results show that our approach outperforms the state-of-the-art scheme, being up to 1.85× faster. However, our performance results also indicate that the current underlying PCIe bus architecture needs improvements to handle the future scenario of many GPUs per node.

DOI: 10.1109/PADSW.2014.7097919
Citation Key: 19138
