Numerous works have developed maximum-distance separable (MDS) storage codes that minimize the total communication cost required to repair a single coded symbol after an erasure, referred to as repair bandwidth (BW). A fundamental property of these codes is the fact that they are vector-linear: each coded symbol is treated as a collection of smaller sized sub-symbols. Communicating sub-symbols, instead of entire coded symbols, is key to achieving low repair BW. In sharp contrast, off-the-shelf codes currently used in practical storage systems suffer from worst-case repair BW costs. Repairing classic codes efficiently has been an interesting open problem. The property of classic MDS codes that impedes nontrivial repair is that they are scalar-linear, i.e., symbols are treated as indivisible chunks. In this work, we present a simple framework that treats scalar codes, defined over extension fields, as vector-linear, enabling them to admit nontrivial repair schemes. The typical repair design problem is viewed as a problem of designing algebraic repair field elements. For the case of 2-parity MDS codes, we develop a scheme that we call clique-repair which identifies good repair strategies using algebraic dependence properties that takes into the structure of the code. We use the above framework to efficiently repair the (5, 3)- and (6, 4)-Reed-Solomon (RS) codes.
展开▼