|Authors||F. Zahid, E. G. Gran and T. Skeie|
|Title||Realizing a Self-Adaptive Network Architecture for HPC Clouds|
|Project(s)||ERAC: Efficient and Robust Architecture for the Big Data Cloud|
|Publication Type||Proceedings, refereed|
|Year of Publication||2016|
|Conference Name||The International Conference for High Performance Computing, Networking, Storage and Analysis (SC '16) Doctoral Showcase|
Clouds offer significant advantages over traditional cluster computing architectures, including ease of deployment, rapid elasticity, and an economically attractive pay-as-you-go business model. However, the effectiveness of cloud computing for HPC systems remains questionable. When clouds are deployed on lossless interconnection networks, challenges related to load balancing, low-overhead virtualization, and performance isolation prevent full utilization of the underlying interconnect. In this work, we attack these challenges and propose a novel holistic framework for a self-adaptive InfiniBand (IB) subnet for HPC clouds. Our solution consists of a feedback control loop that effectively incorporates optimizations based on a multidimensional objective function, using the current resource configuration and provider-defined policies. We build our system using a bottom-up approach, starting by prototyping solutions that tackle the individual research challenges, and later combining our novel solutions into a working self-adaptive cloud prototype. All our results are demonstrated using state-of-the-art industry software to enable easy integration into running systems.