The channel model in which data is encoded as a set of unordered strings is receiving considerable attention because it captures the basic features of DNA storage systems. However, constructing codes with optimal redundancy for this channel has remained an elusive challenge. In this paper, we solve this open problem and present an order-wise optimal construction of codes that correct multiple substitution errors in this channel model. The key ingredient in the code construction is a technique we call robust indexing: instead of using fixed indices to impose an order on the unordered strings, we use indices that are information dependent, thereby eliminating unnecessary redundancy. In addition, our robust indexing technique can be applied to the construction of optimal deletion/insertion codes for this channel.
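To make the contrast between fixed and information-dependent indices concrete, the following toy sketch may help. It is not the construction of the paper: it corrects no errors, and the function names, the binary alphabet, and the rule "use a distinct data prefix as the index" are all assumptions made purely for illustration. It only shows that an order can be recovered from an unordered set either by spending extra bits on a position index or by constraining bits that already carry data.

```python
# Toy illustration only (hypothetical helper names, no error correction).

def fixed_index_encode(payloads, index_bits):
    """Prefix each payload with its position as a fixed-width binary index."""
    return {format(i, f"0{index_bits}b") + p for i, p in enumerate(payloads)}

def fixed_index_decode(stored, index_bits):
    """Recover the original order by sorting on the explicit index, then strip it."""
    ordered = sorted(stored, key=lambda s: int(s[:index_bits], 2))
    return [s[index_bits:] for s in ordered]

def data_dependent_encode(strings, index_bits):
    """Assumed toy rule: the first index_bits of each string double as its index,
    so they must be pairwise distinct. No extra index bits are added."""
    prefixes = [s[:index_bits] for s in strings]
    if len(set(prefixes)) != len(prefixes):
        raise ValueError("prefixes must be distinct to serve as indices")
    return set(strings)

def data_dependent_decode(stored, index_bits):
    """Order the unordered set by its data-carrying prefixes."""
    return sorted(stored, key=lambda s: s[:index_bits])

if __name__ == "__main__":
    payloads = ["1010", "0111", "0001"]
    stored = fixed_index_encode(payloads, index_bits=2)
    print(fixed_index_decode(stored, index_bits=2))

    strings = ["110100", "001011", "100001"]
    stored = data_dependent_encode(strings, index_bits=3)
    print(data_dependent_decode(stored, index_bits=3))
```

In the fixed scheme each stored string spends `index_bits` symbols on a position that carries no information. In the toy data-dependent scheme every symbol belongs to the data, and the only cost is the constraint that the prefixes be distinct. Making such indices robust to substitution errors is the part that this sketch does not attempt.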