Journal: Briefings in Bioinformatics

Denoising DNA deep sequencing data – high-throughput sequencing errors and their correction


Abstract

Characterizing the errors generated by common high-throughput sequencing platforms and distinguishing true genetic variation from technical artefacts are two interdependent steps, essential to many analyses such as single nucleotide variant calling, haplotype inference, sequence assembly and evolutionary studies. Both random and systematic errors can show a specific occurrence profile for each of the six prominent sequencing platforms surveyed here: 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, Pacific Biosciences single-molecule real-time sequencing and Oxford Nanopore sequencing. A large variety of programs is available for error removal in sequencing read data. They differ in the error models and statistical techniques they use, the features of the data they analyse, the parameters they determine from those features, and the data structures and algorithms they use. We highlight the assumptions they make and the data types for which these assumptions hold, providing guidance on which tools to consider for benchmarking given the properties of the data. While no benchmarking results are included here, such specific benchmarks would greatly inform tool choices and future software development. The development of stand-alone error correctors, as well as single nucleotide variant and haplotype callers, could also benefit from using more of the knowledge about error profiles and from (re)combining ideas from the existing approaches presented here.
