Authors: F. Zahid, E. G. Gran and T. Skeie
Title: Realizing a Self-Adaptive Network Architecture for HPC Clouds
Affiliation: Communication Systems
Project(s): ERAC: Efficient and Robust Architecture for the Big Data Cloud
Publication Type: Proceedings, refereed
Year of Publication: 2016
Conference Name: The International Conference for High Performance Computing, Networking, Storage and Analysis (SC '16) Doctoral Showcase

Clouds offer significant advantages over traditional cluster computing architectures, including ease of deployment, rapid elasticity, and an economically attractive pay-as-you-go business model. However, the effectiveness of cloud computing for HPC systems remains questionable. When clouds are deployed on lossless interconnection networks, challenges related to load balancing, low-overhead virtualization, and performance isolation hinder full utilization of the underlying interconnect. In this work, we attack these challenges and propose a novel, holistic framework for a self-adaptive InfiniBand (IB) subnet for HPC clouds. Our solution consists of a feedback control loop that incorporates optimizations based on a multidimensional objective function, using the current resource configuration and provider-defined policies. We build our system using a bottom-up approach: we first prototype solutions to the individual research challenges, and then combine these solutions into a working self-adaptive cloud prototype. All our results are demonstrated using state-of-the-art industry software to enable easy integration into running systems.

Citation Key: 24714