Computational and Structural Biotechnology Journal

ChromatoGate: A Tool for Detecting Base Mis-Calls in Multiple Sequence Alignments by Semi-Automatic Chromatogram Inspection

Abstract

Automated DNA sequencers generate chromatograms that contain raw sequencing data. They also generate data that translate the chromatograms into molecular sequences of A, C, G, T, or N (undetermined) characters. Because chromatogram translation programs frequently introduce errors, manual inspection of the generated sequence data is required. As the number and length of sequences increase, visual inspection and manual correction of chromatograms and the corresponding sequences on a per-peak and per-nucleotide basis become error-prone, time-consuming, and tedious. Here, we introduce ChromatoGate (CG), an open-source tool that accelerates and partially automates the inspection of chromatograms and the detection of sequencing errors for bidirectional sequencing runs. To give users full control over the error correction process, a fully automated error correction algorithm has not been implemented. Initially, the program scans a given multiple sequence alignment (MSA) for potential sequencing errors, assuming that each polymorphic site in the alignment may be attributed to a sequencing error with a certain probability. The guided MSA assembly procedure in ChromatoGate detects the chromatogram peaks of all characters in an alignment that lead to polymorphic sites, given a user-defined threshold. The threshold value represents the sensitivity of the sequencing error detection mechanism. After this pre-filtering, the user only needs to inspect a small number of peaks in every chromatogram to correct sequencing errors. Finally, we show that correcting sequencing errors is important, because population genetic and phylogenetic inferences can be misled by MSAs with uncorrected mis-calls. Our experiments indicate that estimates of population mutation rates can be affected two- to three-fold by uncorrected errors.
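
The pre-filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not ChromatoGate's actual implementation: the abstract does not define how the threshold is applied, so the function name `flag_suspect_sites`, the parameter `max_minor_count`, and the rule itself (a polymorphic column is suspect when a minority character occurs in at most `max_minor_count` sequences) are assumptions made for illustration only. The sketch works on plain aligned strings and does not read chromatogram files.

```python
from collections import Counter


def flag_suspect_sites(alignment, max_minor_count=1, ignore="-N"):
    """Return (column, sequence_name, base) triples worth inspecting.

    Hypothetical rule, not ChromatoGate's documented one: a column is
    suspect when it is polymorphic and a minority character occurs in at
    most `max_minor_count` sequences. Gaps and N are skipped.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")

    suspects = []
    for col in range(length):
        column = {name: alignment[name][col].upper() for name in names}
        counts = Counter(b for b in column.values() if b not in ignore)
        if len(counts) < 2:
            continue  # monomorphic column: nothing to inspect
        majority = counts.most_common(1)[0][0]
        for name, base in column.items():
            if base in ignore or base == majority:
                continue
            if counts[base] <= max_minor_count:
                suspects.append((col, name, base))
    return suspects


# Toy alignment (made-up data) with two singleton differences.
msa = {
    "seq1": "ACGTACGT",
    "seq2": "ACGTACGT",
    "seq3": "ACGAACGT",
    "seq4": "ACGTACCT",
}
suspects = flag_suspect_sites(msa, max_minor_count=1)
for col, name, base in suspects:
    print(f"column {col}: {name} has {base}")
```

Raising `max_minor_count` flags more positions for inspection and lowering it flags fewer, which mirrors the idea of a threshold controlling detection sensitivity. In a real workflow each flagged position would then be checked against the corresponding chromatogram peak by the user.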