Authors: N. A. Nordbotten and T. Skeie
Editors: S. Aluru, M. Parashar, R. Badrinath and V. K. Prasanna
Title: A Routing Methodology for Dynamic Fault Tolerance in Meshes and Tori
Affiliation: Communication Systems
Status: Published
Publication Type: Proceedings, refereed
Year of Publication: 2007
Conference Name: International Conference on High Performance Computing (HiPC)
Pagination: 514-527
Publisher: Springer-Verlag
ISBN Number: 978-3-540-77219-4
Abstract

This paper proposes a fully distributed fault-tolerant routing methodology for tori and meshes. A dynamic fault-model is supported, enabling the network to remain fully operational at all times. Contrary to most previous proposals that support a dynamic fault-model, the methodology is able to tolerate concave fault regions, thereby avoiding disabling healthy nodes in most practical scenarios. The methodology provides high network performance through the use of adaptive routing and provides graceful performance degradation in the presence of faults.

Citation Key: Simula.ND.31