Journal: BMC Bioinformatics

Base calling for high-throughput short-read sequencing: dynamic programming solutions

Abstract

Background: Next-generation DNA sequencing platforms can generate millions of reads in a matter of days at rapidly falling costs. Despite the proliferation of these platforms and continued technological improvements, the performance of next-generation sequencing remains adversely affected by imperfections in the underlying biochemical and signal acquisition procedures. Various techniques, including statistical methods, are therefore used to improve the read lengths and accuracy of these systems. Developing high-performing base calling algorithms that are computationally efficient and scalable is an ongoing challenge.

Results: We develop model-based statistical methods for fast and accurate base calling in Illumina’s next-generation sequencing platforms. In particular, we propose a computationally tractable parametric model that enables a dynamic programming formulation of the base calling problem. Forward-backward and soft-output Viterbi algorithms are developed, and their performance and complexity are investigated and compared with existing state-of-the-art base calling methods for this platform. A C implementation of our algorithm, named Softy, can be downloaded from https://sourceforge.net/projects/dynamicprog.

Conclusion: We demonstrate the high accuracy and speed of the proposed methods on reads obtained using Illumina’s Genome Analyzer II and HiSeq2000. In addition to performing reliable and fast base calling, the developed algorithms enable the incorporation of prior knowledge, which can be used for parameter estimation and is potentially beneficial in various downstream applications.
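
To make the dynamic programming idea concrete, the following is a minimal sketch of a generic forward-backward pass over a four-state (A, C, G, T) hidden Markov model. It is not the Softy implementation and does not reproduce the parametric model of the paper. The Gaussian emission model, the value of sigma, and the names emission, forward_backward, and call_bases are all assumptions made for illustration only.

```python
import math

BASES = "ACGT"


def gaussian(x, mean, sigma):
    """Density of a one-dimensional normal distribution."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def emission(obs, base_index, sigma=0.5):
    """Toy emission model (assumption): the channel of the true base
    reads about 1.0 and the other three channels read about 0.0."""
    p = 1.0
    for channel, x in enumerate(obs):
        mean = 1.0 if channel == base_index else 0.0
        p *= gaussian(x, mean, sigma)
    return p


def forward_backward(intensities, trans, prior):
    """Return, for every cycle, posterior probabilities over A, C, G, T."""
    n, k = len(intensities), len(BASES)
    emis = [[emission(obs, b) for b in range(k)] for obs in intensities]

    # Forward pass, normalised at every cycle to avoid underflow.
    alpha = []
    for t in range(n):
        if t == 0:
            row = [prior[b] * emis[0][b] for b in range(k)]
        else:
            prev = alpha[t - 1]
            row = [emis[t][b] * sum(prev[a] * trans[a][b] for a in range(k))
                   for b in range(k)]
        total = sum(row)
        alpha.append([v / total for v in row])

    # Backward pass, also normalised at every cycle.
    beta = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        row = [sum(trans[a][b] * emis[t + 1][b] * beta[t + 1][b]
                   for b in range(k))
               for a in range(k)]
        total = sum(row)
        beta[t] = [v / total for v in row]

    # Combine both passes into per-cycle posteriors.
    posteriors = []
    for t in range(n):
        row = [alpha[t][b] * beta[t][b] for b in range(k)]
        total = sum(row)
        posteriors.append([v / total for v in row])
    return posteriors


def call_bases(posteriors):
    """Pick the most probable base per cycle and report its posterior."""
    calls = "".join(BASES[max(range(len(BASES)), key=row.__getitem__)]
                    for row in posteriors)
    confidences = [max(row) for row in posteriors]
    return calls, confidences
```

In this sketch, intensities is a list of four-channel readings (one per sequencing cycle), trans is a 4x4 transition matrix, and prior is a length-4 distribution over the first base. The per-cycle posteriors play the role of soft outputs: the largest entry gives the called base, and its value can serve as a confidence score. A non-uniform prior or transition matrix is one place where prior knowledge could enter such a scheme.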