|Authors||Z. Vrba, P. Beskow, P. Halvorsen and C. Griwodz|
|Title||Kahn Process Networks Are a Flexible Alternative to MapReduce|
|Affiliation||Communication Systems, Communication Systems|
|Publication Type||Proceedings, refereed|
|Year of Publication||2009|
|Conference Name||Proceedings of 11th IEEE International Conference on High Performance Computing and Communications (HPCC)|
|Publisher||IEEE Computer Society|
Experience has shown that development using shared-memory concurrency, the prevalent parallel programming paradigm today, is hard, and that its synchronization primitives are nonintuitive because they are low-level and inherently nondeterministic. To help developers, we propose Kahn process networks, which are based on message passing and a shared-nothing model, as a simple and flexible tool for modeling parallel applications. We argue that they are more flexible than MapReduce, which is widely recognized for its efficiency and simplicity. Nevertheless, Kahn process networks are equally intuitive to use, and, indeed, MapReduce is implementable as a Kahn process network. Our benchmarks (word count and k-means) show that a Kahn process network framework permits alternative implementations that bring significant performance advantages: the two programs run up to approximately 2.8 (word count) and 1.8 (k-means) times faster than their implementations for Phoenix, a MapReduce framework specifically optimized for multicore machines.
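To illustrate the claim that a MapReduce-style word count can be expressed as a Kahn process network, the following is a minimal sketch and not the authors' framework or benchmark code. It assumes that processes can be modeled as Python threads and channels as unbounded FIFO queues with blocking reads, each with exactly one producer and one consumer. All names (`source`, `counter`, `merger`, `word_count`, `EOS`) are hypothetical and chosen for illustration. Because CPython threads do not run bytecode in parallel, the sketch shows only the programming model, not the reported speedups.

```python
import threading
from collections import Counter
from queue import Queue

EOS = None  # end-of-stream marker (an assumption of this sketch)


def source(lines, outs):
    # Distribute input lines round-robin over the output channels,
    # then close every channel with an end-of-stream marker.
    for i, line in enumerate(lines):
        outs[i % len(outs)].put(line)
    for ch in outs:
        ch.put(EOS)


def counter(inp, out):
    # "Map" stage: blocking reads until end of stream, then emit one
    # partial word count on the single output channel.
    counts = Counter()
    while (line := inp.get()) is not EOS:
        counts.update(line.split())
    out.put(counts)


def merger(inputs, result):
    # "Reduce" stage: read exactly one partial count from each input
    # channel in a fixed order, so the outcome does not depend on timing.
    total = Counter()
    for ch in inputs:
        total.update(ch.get())
    result.put(total)


def word_count(lines, workers=2):
    # Wire up the network: source -> counters -> merger.
    to_workers = [Queue() for _ in range(workers)]
    to_merger = [Queue() for _ in range(workers)]
    result = Queue()
    procs = [threading.Thread(target=source, args=(lines, to_workers))]
    procs += [
        threading.Thread(target=counter, args=(inp, out))
        for inp, out in zip(to_workers, to_merger)
    ]
    procs.append(threading.Thread(target=merger, args=(to_merger, result)))
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return result.get()
```

No process in this sketch polls a channel or shares mutable state with another; all communication goes through blocking reads on single-producer, single-consumer channels, which is the property that makes a Kahn process network deterministic. Changing the topology, for example by adding a second merging stage, only requires rewiring the channels in `word_count`.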