Journal: DNA research: an international journal for rapid publication of reports on genes and genomes

Heap: a highly sensitive and accurate SNP detection tool for low-coverage high-throughput sequencing data


Abstract

The recent availability of large-scale genomic resources enables so-called genome-wide association studies (GWAS) and genomic prediction (GP) studies, particularly with next-generation sequencing (NGS) data. The effectiveness of GWAS and GP depends not only on their mathematical models but also on the quality and quantity of the variants used in the analysis. In NGS single nucleotide polymorphism (SNP) calling, conventional tools generally need more reads to achieve higher SNP sensitivity and accuracy. In this study, we aimed to develop a tool, Heap, that enables robustly sensitive and accurate SNP calling, particularly with low-coverage NGS data. The reads must be aligned to the reference genome sequences in advance. To reduce false-positive SNPs, Heap determines genotypes and calls SNPs at each site, except for sites at both ends of reads and sites containing a minor allele supported by only one read. A performance comparison with existing tools showed that Heap achieved the highest F-scores with low-coverage (7X) restriction-site associated DNA sequencing reads of sorghum and rice individuals. This will facilitate cost-effective GWAS and GP studies in this NGS era. Code and documentation of Heap are freely available from https://github.com/meiji-bioinf/heap (29 March 2017, date last accessed) and our web site (http://bioinf.mind.meiji.ac.jp/lab/en/tools.html (29 March 2017, date last accessed)).
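
The site-exclusion rule described in the abstract can be illustrated with a small sketch. This is not Heap's actual code. The names `Obs` and `filter_site`, the parameters `end_margin` and `min_minor_support`, and the reading of "both ends of reads" as "the first and last base of each read" are all assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Optional


class Obs(NamedTuple):
    """One read's observation at a genomic site (hypothetical structure)."""
    base: str      # base the read shows at this site
    read_pos: int  # 0-based offset of that base within the read
    read_len: int  # total length of the read


def filter_site(
    observations: List[Obs],
    end_margin: int = 1,          # assumed: ignore this many bases at each read end
    min_minor_support: int = 2,   # assumed: a minor allele needs at least 2 reads
) -> Optional[Dict[str, int]]:
    """Return allele counts for a site, or None if the site is skipped."""
    # Ignore observations that fall within end_margin bases of either read end.
    kept = [
        o for o in observations
        if end_margin <= o.read_pos < o.read_len - end_margin
    ]
    counts = Counter(o.base for o in kept)
    if not counts:
        return None  # nothing usable remains at this site

    # Skip the site if any non-major allele has too little read support.
    for _base, n in counts.most_common()[1:]:
        if n < min_minor_support:
            return None
    return dict(counts)


if __name__ == "__main__":
    site = [Obs("A", 10, 50)] * 4 + [Obs("G", 20, 50)] * 2 + [Obs("G", 0, 50)]
    print(filter_site(site))
```

In this sketch the `G` observed at offset 0 is ignored because it sits at a read end, and the remaining two `G` reads are enough to keep the site. A site with a single supporting read for its minor allele would return `None`. Genotype determination itself is outside the scope of this example.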
