Authors: B. Bogdanski, B. D. Johnsen, S. Reinemo and F. O. Sem-Jacobsen
Editors: H. Shen, Y. Sang, Y. Li, D. Qian and A. Y. Zomaya
Title: Discovery and Routing of Degraded Fat-Trees
Affiliation: Communication Systems, Communication Systems
Publication Type: Proceedings, refereed
Year of Publication: 2012
Conference Name: 2012 13th International Conference on Parallel and Distributed Computing, Applications and Technologies
Date Published: December
Publisher: IEEE Computer Society
Place Published: Los Alamitos

The fat-tree topology has become a popular choice for InfiniBand enterprise systems due to its deadlock freedom, fault tolerance and full bisection bandwidth. In the HPC domain, InfiniBand fabric is used in almost 42% of the systems on the latest Top 500 list, and many of those systems are based on the fat-tree topology. Despite this popularity, little research has been done to compare the behavior of InfiniBand routing algorithms on degraded fat-tree topologies. In this paper, we identify the weaknesses of the current fat-tree routing and propose enhancements that relax the restrictions imposed on the routed fabric. Furthermore, we present a thorough analysis of the non-proprietary routing algorithms implemented in the InfiniBand Open Subnet Manager. Our results show that even though the performance of a fat-tree routed network deteriorates predictably with the number of failed links, the fat-tree routing algorithm is still the best choice for severely degraded fat-tree fabrics.

Citation Key: Simula.simula.1554