Implications of Pyrosequencing Error Correction for Biological Data Interpretation

Abstract

There has been a rapid proliferation of approaches for processing and manipulating second generation DNA sequence data. However, users are often left with uncertainties about how the choice of processing methods may impact biological interpretation of data. In this report, we probe differences in output between two different processing pipelines: a de-noising approach using the AmpliconNoise algorithm for error correction, and a standard approach using quality filtering and preclustering to reduce error. There was a large overlap in reads culled by each method, although AmpliconNoise removed a greater net number of reads. Most OTUs produced by one method had a clearly corresponding partner in the other. Although each method resulted in OTUs consisting entirely of reads that were culled by the other method, there were many more such OTUs formed in the standard pipeline. Total OTU richness was reduced by AmpliconNoise processing, but per-sample OTU richness, diversity and evenness were increased. Increases in per-sample richness and diversity may be a result of AmpliconNoise processing producing a more even OTU rank-abundance distribution. Because communities were randomly subsampled to equalize sample size across communities, and because rare sequence variants are less likely to be selected during subsampling, fewer OTUs were lost from individual communities when subsampling AmpliconNoise-processed data. In contrast to taxon-based diversity estimates, phylogenetic diversity was reduced even on a per-sample basis by de-noising, and samples switched widely in diversity rankings. This work illustrates the significant impacts of processing pipelines on the biological interpretations that can be made from pyrosequencing surveys. This study provides important cautions for analyses of contemporary data, for requisite data archiving (processed vs. non-processed data), and for drawing comparisons among studies performed using distinct data processing pipelines.
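
The subsampling argument in the abstract can be illustrated with a small simulation. The sketch below is not the authors' pipeline and uses no real data: the two communities, the subsampling depth, and all function names are hypothetical, chosen only to show why a more even rank-abundance distribution loses fewer OTUs when reads are randomly subsampled to a common depth. Both communities hold the same number of OTUs and reads; only evenness differs.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Randomly draw `depth` reads without replacement from an OTU count table."""
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads")
    return Counter(rng.sample(reads, depth))


def richness(counts):
    """Number of OTUs observed at least once."""
    return sum(1 for n in counts.values() if n > 0)


def shannon(counts):
    """Shannon diversity H' (natural log)."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def pielou_evenness(counts):
    """Pielou's J = H' / ln(S); defined as 0 when fewer than two OTUs are present."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0


def mean_rarefied_richness(counts, depth, trials=20, seed=0):
    """Average richness over repeated random subsamples (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(richness(rarefy(counts, depth, rng)) for _ in range(trials)) / trials


# Hypothetical communities: 50 OTUs and 1,000 reads each.
# `skewed` has one dominant OTU and 49 singletons; `even` has 20 reads per OTU.
skewed = {"otu0": 951, **{f"otu{i}": 1 for i in range(1, 50)}}
even = {f"otu{i}": 20 for i in range(50)}

DEPTH = 200  # assumed common subsampling depth
skewed_mean = mean_rarefied_richness(skewed, DEPTH)
even_mean = mean_rarefied_richness(even, DEPTH)

if __name__ == "__main__":
    for name, community, mean in (("skewed", skewed, skewed_mean), ("even", even, even_mean)):
        print(f"{name}: richness before={richness(community)}, "
              f"evenness={pielou_evenness(community):.3f}, "
              f"mean richness after subsampling={mean:.1f}")
```

In this toy setup, each singleton OTU in the skewed community survives subsampling only if its single read happens to be drawn, whereas each OTU in the even community has 20 chances to be drawn. The even community therefore retains more of its OTUs at the same depth, which is the mechanism the abstract proposes for the higher per-sample richness in AmpliconNoise-processed data.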