Authors | F. Zahid, E. G. Gran and T. Skeie |
Title | Realizing a Self-Adaptive Network Architecture for HPC Clouds |
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Status | Published |
Publication Type | Proceedings, refereed |
Year of Publication | 2016 |
Conference Name | The International Conference for High Performance Computing, Networking, Storage and Analysis (SC '16) Doctoral Showcase |
Abstract | Clouds offer significant advantages over traditional cluster computing architectures, including ease of deployment, rapid elasticity, and an economically attractive pay-as-you-go business model. However, the effectiveness of cloud computing for HPC systems remains questionable. When clouds are deployed on lossless interconnection networks, challenges related to load balancing, low-overhead virtualization, and performance isolation hinder full utilization of the underlying interconnect. In this work, we attack these challenges and propose a novel holistic framework for a self-adaptive InfiniBand (IB) subnet for HPC clouds. Our solution consists of a feedback control loop that incorporates optimizations based on a multidimensional objective function, using the current resource configuration and provider-defined policies. We build our system using a bottom-up approach, starting by prototyping solutions that tackle the individual research challenges, and later combining these solutions into a working self-adaptive cloud prototype. All our results are demonstrated using state-of-the-art industry software to enable easy integration into running systems. |
Citation Key | 24714 |