Journal: Bioinformatics

Correction of sequencing errors in a mixed set of reads


Abstract

Motivation: High-throughput sequencing technologies produce large sets of short reads that may contain errors. These sequencing errors make de novo assembly challenging. Error correction aims to reduce the error rate prior to assembly. Many de novo sequencing projects use reads from several sequencing technologies to gain the benefits of each technology and to alleviate their shortcomings. However, combining such a mixed set of reads is problematic because many tools are specific to one sequencing platform. The SOLiD sequencing platform is especially problematic in this regard because of the two-base color coding of its reads. Therefore, new tools for working with mixed read sets are needed.

Results: We present an error correction tool for correcting substitutions, insertions and deletions in a mixed set of reads produced by various sequencing platforms. We first develop a method for correcting reads from any sequencing technology that produces base space reads, such as the SOLEXA/Illumina and Roche/454 Life Sciences sequencing platforms. We then further refine the algorithm to correct the color space reads from the Applied Biosystems SOLiD sequencing platform together with normal base space reads. Our new tool is based on the SHREC program, which is aimed at correcting SOLEXA/Illumina reads. Our experiments show that we can detect errors with 99% sensitivity and >98% specificity if the combined sequencing coverage of the sets is at least 12. We also show that the error rate of the reads is greatly reduced.

Availability: The Java source code is freely available at http://www.cs.helsinki.fi/u/lmsalmel/hybrid-shrec/

Contact: leena.salmela@cs.helsinki.fi
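As an illustration of why color space reads are awkward to mix with base space reads, the following minimal sketch encodes and decodes a read using the standard two-base color code (A=0, C=1, G=2, T=3, with each color being the XOR of two adjacent base codes). The class and method names are hypothetical and are not taken from the tool described above; the sketch only shows the encoding itself, not the correction algorithm.

```java
public class ColorSpace {
    // Base order chosen so that the color of a base pair is the XOR of the two indices.
    private static final String BASES = "ACGT";

    // Encode a base space read as its first base followed by one color per adjacent base pair.
    public static String encode(String bases) {
        StringBuilder sb = new StringBuilder();
        sb.append(bases.charAt(0));
        for (int i = 1; i < bases.length(); i++) {
            int prev = BASES.indexOf(bases.charAt(i - 1));
            int cur = BASES.indexOf(bases.charAt(i));
            sb.append(prev ^ cur);
        }
        return sb.toString();
    }

    // Decode a color space read (first base followed by color digits 0-3) back to base space.
    public static String decode(String colors) {
        StringBuilder sb = new StringBuilder();
        int cur = BASES.indexOf(colors.charAt(0));
        sb.append(BASES.charAt(cur));
        for (int i = 1; i < colors.length(); i++) {
            cur ^= colors.charAt(i) - '0'; // each color flips the running base code
            sb.append(BASES.charAt(cur));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("ACGT")); // color space form of the read
        System.out.println(decode("A131")); // error-free decoding
        System.out.println(decode("A101")); // one wrong color changes every later base
    }
}
```

In this sketch a single miscalled color (3 read as 0) changes every base after it when the read is naively decoded, which is one reason color space reads cannot simply be converted to base space before correction.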
