The rapid advance of next-generation sequencing (NGS) technologies has dramatically decreased the cost of genomic sequencing, enabling accurate variant discovery across the whole genomes of many individuals. However, current large-scale, cost-effective resequencing platforms produce reads of limited length, and as a result, reliable identification of variants within highly homologous regions of a target genome remains challenging. The 1000 Genomes Consortium has identified nearly 171 Mbp (6% of the GRCh37 build) that are inaccessible to short-read technologies. Further studies have shown that, for accurate variant discovery, the inaccessible fraction is upwards of 10%.