Journal: Nucleic Acids Research

PyroHMMsnp: an SNP caller for Ion Torrent and 454 sequencing data


Abstract

Both 454 and Ion Torrent sequencers can produce large numbers of long, high-quality sequencing reads. However, because both methods sequence an entire homopolymer in one cycle, both suffer from homopolymer length uncertainty and incorporation asynchronization. During mapping, such sequencing errors can shift alignments around homopolymers and thereby induce incorrect mismatches, which have become a critical barrier to the accurate detection of single nucleotide polymorphisms (SNPs). In this article, we propose a hidden Markov model (HMM) that statistically and explicitly formulates homopolymer sequencing errors in terms of overcalls, undercalls, insertions and deletions. We use a hierarchical model to describe the sequencing and base-calling processes, and we estimate the parameters of the HMM from resequencing data with an expectation-maximization algorithm. Based on the HMM, we develop a realignment-based SNP-calling program, termed PyroHMMsnp, which realigns read sequences around homopolymers according to the error model and then infers the underlying genotype using a Bayesian approach. Simulation experiments show that the performance of PyroHMMsnp is exceptional across various sequencing coverages in terms of sensitivity, specificity and F1 measure, compared with other tools. Analysis of human resequencing data shows that PyroHMMsnp predicts 12.9% more SNPs than Samtools while achieving a higher specificity. Project page: http://code.google.com/p/pyrohmmsnp/.
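To make the homopolymer problem concrete, the following minimal Python sketch shows how a single undercall appears as several mismatches under a naive position-by-position comparison, yet as one run-length difference when both sequences are viewed as homopolymer runs. It is an illustration only and is not the PyroHMMsnp model: the function names `homopolymer_runs`, `classify_length_errors` and `naive_mismatches`, and the toy sequences, are assumptions introduced here, and the sketch handles run-length errors only (no substitutions, insertions of new runs, or deletions of whole runs).

```python
from itertools import groupby


def homopolymer_runs(seq):
    """Collapse a sequence into (base, run_length) pairs."""
    return [(base, len(list(group))) for base, group in groupby(seq)]


def classify_length_errors(reference, read):
    """Label each run whose length differs as an overcall or undercall.

    Assumes the read and reference contain the same bases in the same
    run order, so only run lengths can differ (hypothetical simplification).
    """
    ref_runs = homopolymer_runs(reference)
    read_runs = homopolymer_runs(read)
    if [b for b, _ in ref_runs] != [b for b, _ in read_runs]:
        raise ValueError("run bases differ; this sketch handles length errors only")
    calls = []
    for (base, ref_len), (_, read_len) in zip(ref_runs, read_runs):
        if read_len > ref_len:
            calls.append((base, ref_len, read_len, "overcall"))
        elif read_len < ref_len:
            calls.append((base, ref_len, read_len, "undercall"))
    return calls


def naive_mismatches(reference, read):
    """Positions that differ under an ungapped, position-by-position comparison."""
    return [i for i, (r, q) in enumerate(zip(reference, read)) if r != q]


reference = "ACGTTTTAGC"
read = "ACGTTTAGC"  # toy read: the TTTT run was undercalled as TTT

print(naive_mismatches(reference, read))
print(classify_length_errors(reference, read))
```

In this toy case the ungapped comparison reports mismatches at every position downstream of the shortened run, whereas the run-length view reports a single undercall of the T homopolymer. Spurious mismatches of this kind are what a realignment step around homopolymers is meant to remove before genotypes are inferred.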
