Permutation Code: Optimal Exact-Repair of a Single Failed Node in MDS Code Based Distributed Storage Systems

IEEE International Symposium on Information Theory Proceedings, Saint Petersburg, Russia, August 2011

Published by Institute of Electrical and Electronics Engineers

Publication

We consider exact repair of failed nodes in distributed storage systems based on maximum distance separable (MDS) codes. It is well known that an (n, k) MDS code, used to store k information elements over n distributed storage disks, can tolerate the failure (erasure) of up to n – k disks. The focus of this paper is the optimal recovery of a single failed node in terms of repair bandwidth, that is, the amount of data that must be downloaded to repair the node. Dimakis et al. have previously shown that, when each storage disk stores ℒ units of data, repairing a single failed node requires a repair bandwidth of at least ℒ(n – 1)/(n – k) units. The achievability of this lower bound for arbitrary values of (n, k) has been shown previously using asymptotic code constructions based on asymptotic interference alignment. However, the existence of finite codes satisfying this lower bound has been shown only for specific regimes of (n, k), and their existence for arbitrary (n, k) remained open. In this paper, we provide the first known construction of a finite code for arbitrary (n, k) that can repair a single failed systematic node by downloading exactly ℒ(n – 1)/(n – k) units of data. The code is built from permutation matrices and is hence termed the Permutation Code.
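
To make the size of the saving concrete, the following minimal Python sketch compares the lower bound ℒ(n – 1)/(n – k) with the bandwidth of a naive repair that downloads k·ℒ units from any k surviving disks and re-encodes the lost one. It only evaluates the two formulas and does not implement the Permutation Code itself; the function names and the `unit` parameter (standing in for ℒ) are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction


def naive_repair_bandwidth(k, unit=1):
    """Bandwidth of naive repair: fetch everything from k surviving disks.

    `unit` stands in for the per-disk storage size (the script-L of the abstract).
    """
    return Fraction(k) * unit


def min_repair_bandwidth(n, k, unit=1):
    """Lower bound unit * (n - 1) / (n - k) for repairing one failed node.

    Equivalent to each of the n - 1 surviving disks sending unit / (n - k).
    """
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    return Fraction(n - 1, n - k) * unit


if __name__ == "__main__":
    # Hypothetical (n, k) pairs chosen only for illustration.
    for n, k in [(4, 2), (6, 4), (14, 10)]:
        best = min_repair_bandwidth(n, k)
        naive = naive_repair_bandwidth(k)
        print(f"(n, k) = ({n}, {k}): bound = {best}, naive = {naive}, "
              f"ratio = {float(best / naive):.3f}")
```

Exact fractions are used so that the bound is not blurred by floating-point rounding. When n – k = 1 the two quantities coincide, so the bound offers a saving only for codes with at least two parity disks.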