Background: The identification of inversions of DNA segments shorter than read length (e.g., 100 bp), defined as micro-inversions (MIs), remains challenging for next-generation sequencing reads. MIs are acknowledged to be an important class of genomic variation and may play a role in causing genetic disease. However, current alignment methods are generally insensitive to MIs. Here we develop a novel tool, MID (Micro-Inversion Detector), to identify MIs in human genomes using next-generation sequencing reads.

Results: The algorithm of MID is based on a dynamic programming path-finding approach. What makes MID different from other variant detection tools is that it can handle small MIs and multiple breakpoints within an unmapped read. Moreover, MID improves reliability on low coverage data by integrating multiple samples. Our evaluation demonstrated that MID outperforms Gustaf, which can currently detect inversions from 30 bp to 500 bp.

Conclusions: To our knowledge, MID is the first method that can efficiently and reliably identify MIs from unmapped short next-generation sequencing reads. MID is reliable on low coverage data, which makes it suitable for large-scale projects such as the 1000 Genomes Project (1KGP). MID identified previously unknown MIs from the 1KGP that overlap with genes and regulatory elements in the human genome. We also identified MIs in cancer cell lines from the Cancer Cell Line Encyclopedia (CCLE). Our tool is therefore expected to be useful for improving the study of MIs as a type of genetic variant in the human genome. The source code can be downloaded from: http://cqb.pku.edu.cn/ZhuLab/MID.
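To make the definition of a micro-inversion concrete, the following Python sketch shows what such a read looks like relative to the reference: the read matches the reference everywhere except for one internal segment, which appears as its reverse complement. This is only an illustration under simplifying assumptions (the read and the reference window have equal length, there is a single inversion, and there are no sequencing errors or indels). It uses a simple mismatch-boundary check and is not MID's dynamic programming path-finding algorithm; the function names `reverse_complement` and `find_micro_inversion` are hypothetical and do not come from the MID source code.

```python
# Illustrative sketch only: NOT the MID algorithm. Assumes an error-free read
# that covers a reference window of the same length and carries at most one
# inverted segment.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_micro_inversion(ref: str, read: str, min_len: int = 2):
    """Return (start, end) such that read[start:end] is the reverse
    complement of ref[start:end] and the flanks match exactly, else None.

    The interval reported is the minimal one spanned by the mismatches;
    bases at the edge of the true inversion that happen to be unchanged
    by reverse complementation cannot be distinguished from the flanks.
    """
    if len(ref) != len(read) or ref == read:
        return None
    n = len(ref)
    # Leftmost and rightmost mismatching positions bound the candidate segment.
    start = next(i for i in range(n) if ref[i] != read[i])
    end = next(i for i in range(n, 0, -1) if ref[i - 1] != read[i - 1])
    if end - start < min_len:
        return None  # a single mismatch looks like a SNP, not an inversion
    if read[start:end] == reverse_complement(ref[start:end]):
        return start, end
    return None


if __name__ == "__main__":
    ref = "GATTACAGGCTTAACCGT"
    # Simulate a read in which ref[5:11] has been inverted.
    read = ref[:5] + reverse_complement(ref[5:11]) + ref[11:]
    print(find_micro_inversion(ref, read))
```

A real detector has to cope with mismatches from sequencing errors, several breakpoints in one read, and reads that fail to map at all, which is why the abstract describes a dynamic programming approach rather than an exact comparison like this one.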