|Authors||E. Tasoulas, E. G. Gran, B. D. Johnsen and T. Skeie|
|Title||A Novel Query Caching Scheme for Dynamic InfiniBand Subnets|
|Afilliation||Communication Systems, Communication Systems|
|Publication Type||Proceedings, refereed|
|Year of Publication||2015|
|Conference Name||15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)|
In large InfiniBand subnets the Subnet Manager (SM) is a potential bottleneck. When an InfiniBand subnet grows in size, the number of paths between hosts increases polynomially and the SM may not be able to serve the network in a timely manner when many concurrent path resolution requests are received. This scalability challenge is further amplified in a dynamic virtualized cloud environment. When a Virtual Machine (VM) with InfiniBand interconnect live migrates, the VM addresses change. These address changes result in additional load to the SM as communicating peers send Subnet Administration (SA) path record queries to the SM to resolve new path characteristics.
In this paper we benchmark OpenSM to empirically demonstrate the SM scalability problems. Then we show that our novel SA Path Record Query caching scheme significantly reduces the load towards the SM. In particular, we show by using the Reliable Datagram Socket protocol that only a single initial SA path query is needed per communicating peer, independent of any subsequent (re)connection attempts.