Distributed storage is the scalable and economically viable technology for preserving our collective memory. It is not yet known how to optimally design distributed storage systems that are both robust against arbitrary failures and secure against determined attacks. This project addresses these issues through a theoretical approach guided by practical concerns.
Due to the vast amounts of data being generated and accessed worldwide, the demand for large-scale data storage has increased dramatically in recent years. Data centers typically employ cheap commodity hardware connected in a distributed storage system in order to scale massively at low cost; examples of such systems are OceanStore and the Google File System (GFS). The cheap components fail frequently, and software glitches, machine reboots, local power failures, and maintenance operations also render devices unavailable from time to time. Resilience to failures of individual components is therefore an essential property of a distributed storage system. Traditionally, this resilience is provided by replication across multiple machines; for instance, GFS and the Hadoop Distributed File System (HDFS) store three copies of all data by default. At massive scale, storing multiple copies of all files is expensive and inefficient, and data centers are therefore increasingly turning to more sophisticated coding-theoretic techniques.
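To illustrate the contrast between replication and coding, the following is a minimal sketch of the simplest coding-theoretic scheme, a single XOR parity block: data is split into k blocks and one parity block is added, so any one lost block can be rebuilt from the surviving k. This is an assumption-laden toy example for exposition, not the code of any system mentioned above; production systems use stronger codes such as Reed-Solomon.

```python
# Toy (k, k+1) erasure code: k data blocks plus one XOR parity block.
# Illustrative only -- real systems (e.g. HDFS erasure coding) use
# Reed-Solomon codes that tolerate multiple simultaneous failures.

def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

def encode(data, k):
    """Split data into k equal blocks (zero-padded) plus one XOR parity block."""
    size = -(-len(data) // k)                   # ceiling division
    padded = data.ljust(k * size, b"\0")
    blocks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return blocks + [xor_blocks(blocks)]        # k + 1 blocks in total

def recover(blocks, lost_index):
    """Rebuild a single lost block as the XOR of all surviving blocks."""
    survivors = [b for i, b in enumerate(blocks) if i != lost_index]
    return xor_blocks(survivors)

data = b"our collective memory"
k = 4
blocks = encode(data, k)
assert recover(blocks, lost_index=2) == blocks[2]
```

With k = 4 the storage overhead is (k + 1)/k = 1.25x, compared with 3x for the default three-way replication of GFS and HDFS, which conveys why coding-theoretic techniques become attractive at scale.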
The Research Council of Norway (FRINATEK)