Journal: Briefings in Bioinformatics

Robust and exact structural variation detection with paired-end and soft-clipped alignments: SoftSV compared with eight algorithms


Abstract

Structural variation (SV) plays an important role in genetic diversity among the population in general and specifically in diseases such as cancer. Modern next-generation sequencing (NGS) technologies provide paired-end sequencing data at high depth with increasing read lengths. This development enabled the analysis of split-reads to detect SV breakpoints with single-nucleotide resolution. But ambiguous mappings and breakpoint sequences with further co-occurring mutations hamper split-read alignments against a reference sequence. The trade-off between high sensitivity and low false-positive rate is problematic and often requires a lot of fine-tuning of the analysis method based on knowledge about its algorithm and the characteristics of the data set. We present SoftSV, a method for exact breakpoint detection for small and large deletions, inversions, tandem duplications and inter-chromosomal translocations, which relies solely on the mutual alignment of soft-clipped reads within the neighborhood of discordantly mapped paired-end reads. Unlike other SV detection algorithms, our approach does not require thresholds regarding sequencing coverage or mapping quality. We evaluate SoftSV together with eight approaches (Breakdancer, Clever, CREST, Delly, GASVPro, Pindel, Socrates and SoftSearch) on simulated and real data sets. Our results show that sensitive and reliable SV detection is subject to many different factors like read length, sequence coverage and SV type. While most programs have their individual drawbacks, our greedy approach turns out to be the most robust and sensitive on many experimental setups. Sensitivities above 85% and positive predictive values between 80 and 100% could be achieved consistently for all SV types on simulated data sets starting at relatively short 75 bp reads and low 10-15x sequence coverage.
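
To make the term "soft-clipped alignment" concrete, the following is a minimal Python sketch of how a soft clip in a SAM CIGAR string points to a candidate breakpoint position. It is not SoftSV's implementation: the function name `soft_clip_breakpoints` and its return format are hypothetical, and the sketch only covers the first step of locating clip positions, not the mutual alignment of clipped reads that the abstract describes.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def soft_clip_breakpoints(pos, cigar):
    """Return candidate breakpoints implied by soft clips in one alignment.

    pos   -- 1-based leftmost mapped reference position (SAM POS field)
    cigar -- SAM CIGAR string, e.g. "30S45M"

    Each result is a tuple (side, reference_position, clipped_length).
    Hypothetical helper for illustration only.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips may sit outside soft clips; ignore them here.
    core = [(n, op) for n, op in ops if op != "H"]
    ref_len = sum(n for n, op in core if op in REF_CONSUMING)

    candidates = []
    if core and core[0][1] == "S":
        # Clipped bases precede the first aligned base.
        candidates.append(("left", pos, core[0][0]))
    if len(core) > 1 and core[-1][1] == "S":
        # Clipped bases follow the last aligned base.
        candidates.append(("right", pos + ref_len - 1, core[-1][0]))
    return candidates


if __name__ == "__main__":
    print(soft_clip_breakpoints(100, "30S45M"))
    print(soft_clip_breakpoints(100, "50M25S"))
```

In this sketch, a read whose CIGAR starts with a soft clip suggests a breakpoint immediately before its first aligned base, and a read ending with a soft clip suggests one immediately after its last aligned base. Several reads clipped at the same reference position would support the same candidate breakpoint.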