Journal: Nucleic Acids Research

A novel conceptual approach to read-filtering in high-throughput amplicon sequencing studies

Abstract

Adequate read filtering is critical when processing high-throughput data in marker-gene-based studies. Sequencing errors can cause otherwise similar reads to be mis-clustered, artificially increasing the number of retrieved Operational Taxonomic Units (OTUs) and therefore leading to an overestimation of microbial diversity. Sequencing errors also result in OTUs that are not accurate reconstructions of the original biological sequences. Herein we present the Poisson binomial filtering algorithm (PBF), which minimizes both problems by calculating the error-probability distribution of a sequence from its quality scores. To validate our method, we quality-filtered 37 publicly available datasets obtained by sequencing mock and environmental microbial communities with the Roche 454, Illumina MiSeq and IonTorrent PGM platforms, and compared our results to those obtained with previous approaches such as the ones included in mothur, QIIME and USEARCH. Our algorithm retained substantially more reads than its predecessors, while resulting in fewer and more accurate OTUs. This improved sensitivity produced more faithful representations, both quantitatively and qualitatively, of the true microbial diversity present in the studied samples. Furthermore, the method introduced in this work is computationally inexpensive and can be readily applied in conjunction with any existing analysis pipeline.
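
The abstract says that PBF calculates the error-probability distribution of a sequence from its quality scores, but it does not give the filtering criterion itself. The sketch below is only an illustration of that idea. It assumes standard Phred scores (per-base error probability 10^(-Q/10)) and independent base errors, so the number of errors in a read follows a Poisson binomial distribution. The function names and the decision rule in `keep_read` (the `max_errors` and `min_prob` parameters) are hypothetical and are not taken from the paper.

```python
from typing import List


def phred_to_error_probs(quals: List[int]) -> List[float]:
    """Convert Phred quality scores to per-base error probabilities."""
    return [10 ** (-q / 10) for q in quals]


def error_count_distribution(probs: List[float]) -> List[float]:
    """Poisson binomial PMF: dist[k] = P(exactly k erroneous bases).

    Built by dynamic programming, adding one base at a time and
    assuming base errors are independent.
    """
    dist = [1.0]  # with zero bases, P(0 errors) = 1
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)   # this base is correct
            nxt[k + 1] += mass * p     # this base is an error
        dist = nxt
    return dist


def keep_read(quals: List[int], max_errors: int = 0,
              min_prob: float = 0.95) -> bool:
    """Hypothetical rule: keep the read if P(errors <= max_errors) >= min_prob.

    The thresholds are illustrative assumptions, not the published criterion.
    """
    dist = error_count_distribution(phred_to_error_probs(quals))
    return sum(dist[: max_errors + 1]) >= min_prob
```

With this rule, a 100-base read of uniform Q40 would be kept under the default thresholds, whereas a 100-base read of uniform Q10 would be discarded. The dynamic program is quadratic in read length, which is inexpensive for typical amplicon reads.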