Authors: E. G. Gran
Title: Congestion Management in Lossless Interconnection Networks
Affiliation: Communication Systems
Status: Published
Publication Type: PhD Thesis
Year of Publication: 2014
Date Published: March
Publisher: Faculty of Mathematics and Natural Sciences, University of Oslo
Thesis Type: PhD
Abstract

In supercomputers and modern data center clusters, lossless interconnection networks are frequently used to achieve high throughput and low latency. It has been known for three decades, however, that congestion and congestion spreading in such networks can lead to severe performance degradation if no countermeasures are taken. Nevertheless, for a long time, the challenges related to network-wide congestion in lossless interconnection networks received little attention. A combination of tuning and tailoring of network characteristics for a given application, together with overprovisioning of network resources, kept congestion and congestion spreading from occurring in practice. Dynamic congestion management was therefore not really needed. During the last decade, however, we have seen a renewed interest in congestion management for lossless interconnection networks. The use of virtualization, together with an increased focus on cost-efficient green computing, has spawned a desire to operate networks with dynamic and unpredictable traffic patterns closer to saturation. As a result, proper congestion management is needed. In this thesis, we study congestion management in lossless interconnection networks in general, while giving special attention to the congestion control mechanism specified for InfiniBand, currently one of the most popular interconnection network standards.
The contributions of the thesis include guidelines on how to implement congestion detection in switches, facilitating injection throttling at the source nodes, so as to avoid unfair treatment of contributors to congestion; an exploration of the rich InfiniBand congestion control parameter space and its influence on the performance of the congestion control mechanism; a study of the scope of an injection-throttling-based congestion management mechanism, like the one specified for InfiniBand; an abstract classification scheme for congestion trees of varying degrees of dynamics; and finally, two novel congestion management mechanisms, for input-buffered switches and for switches utilizing virtual output queuing, respectively, that overcome the weaknesses of current congestion management mechanisms based on injection throttling or hot-flow dynamic isolation.