The design of a large and reliable DNA codeword library is a key problem in DNA-based computing. DNA codes, namely sets of fixed-length edit metric codewords over the alphabet {A, C, G, T}, satisfy certain combinatorial constraints that reflect biological and chemical restrictions on DNA strands. The primary constraints that we consider are the reverse-complement constraint and the fixed GC-content constraint, as well as the basic edit distance constraint between codewords. We focus on exploring the theory underlying DNA codes and discuss several approaches to searching for optimal DNA codes. We use Conway's lexicode algorithm and an exhaustive search algorithm to produce provably optimal DNA codes for small parameter values. We also propose a genetic algorithm to search for suboptimal DNA codes with relatively large parameter values, whose sizes serve as reasonable lower bounds on the sizes of optimal DNA codes. Furthermore, we provide tables of bounds on the sizes of DNA codes with length from 1 to 9 and minimum distance from 1 to 9.