Authors: X. Dong, M. Wen, J. Chai, X. Cai, M. Zhao and C. Zhang
Title: Communication-Hiding Programming for Clusters with Multi-Coprocessor Nodes
Affiliation: Scientific Computing
Project(s): Center for Biomedical Computing (SFF)
Status: Published
Publication Type: Journal Article
Year of Publication: 2015
Journal: Concurrency and Computation: Practice and Experience
Volume: 27
Issue: 16
Pagination: 4172–4185
Date Published: 05/2015
Publisher: John Wiley & Sons, Ltd
Keywords: hybrid programming, Intel Xeon Phi coprocessor, offload model, SCIF, Tianhe-2
Abstract

Future exascale systems are expected to adopt compute nodes that incorporate many accelerators. To shed some light on the upcoming software challenge, this paper investigates the particular topic of programming clusters that have multiple Xeon Phi coprocessors in each compute node. A new offload approach is considered for intra-node communication, which combines Intel’s APIs of coprocessor offload infrastructure (COI) and symmetric communication interface (SCIF) for achieving low latency. While the conventional pragma-based offload approach allows simpler programming, the COI-SCIF approach has three advantages: (1) lower overhead associated with launching offloaded code, (2) higher data transfer bandwidths, and (3) more advanced asynchrony between computation and data movement. The low-level COI-SCIF approach is also shown to have benefits over the MPI-OpenMP counterpart, which belongs to the symmetric usage mode. Moreover, a hybrid programming strategy based on COI-SCIF is presented for joining the computational force of all CPUs and coprocessors, while realizing communication hiding. All the programming approaches are tested by a real-world 3D application, for which the COI-SCIF-based approach shows a performance advantage on Tianhe-2.

Notes

Published online before print.

DOI: 10.1002/cpe.3507
Citation Key: 19137
