WABI 2011

Separating Metagenomic Short Reads into Genomes via Clustering (Extended Abstract)

Abstract

The metagenomics approach allows the simultaneous sequencing of all genomes in an environmental sample. This results in high-complexity datasets, where in addition to repeats and sequencing errors, the number of genomes and their abundance ratios are unknown. Recently developed next-generation sequencing (NGS) technologies significantly improve sequencing efficiency and cost. On the other hand, they result in shorter reads, which makes the separation of reads from different species harder. In this work, we present a two-phase heuristic algorithm for separating short paired-end reads from different genomes in a metagenomic dataset. We use the observation that most of the l-mers belong to unique genomes when l is sufficiently large. The first phase of the algorithm results in clusters of l-mers, each of which belongs to one genome. During the second phase, clusters are merged based on l-mer repeat information. These final clusters are used to assign reads. The algorithm can handle very short reads and sequencing errors. Our tests on a large number of simulated metagenomic datasets concerning species at various phylogenetic distances demonstrate that genomes can be separated if the number of common repeats is smaller than the number of genome-specific repeats. For such genomes, our method can separate NGS reads with high precision and sensitivity.
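The observation that most l-mers belong to a single genome once l is large enough can be illustrated with a small sketch. This is not the algorithm from the paper: the helper names (`lmers`, `shared_fraction`, `random_genome`), the genome length of 2000, and the use of uniformly random sequences are all assumptions made for illustration. Real genomes contain repeats and shared regions, so the drop-off would be less clean in practice.

```python
import random


def lmers(seq, l):
    """Return the set of all length-l substrings of seq."""
    return {seq[i:i + l] for i in range(len(seq) - l + 1)}


def shared_fraction(genome_a, genome_b, l):
    """Fraction of distinct l-mers (over both genomes) that occur in both."""
    a, b = lmers(genome_a, l), lmers(genome_b, l)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def random_genome(length, rng):
    """Hypothetical toy genome: uniformly random bases (an assumption)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


rng = random.Random(0)  # fixed seed so the run is repeatable
genome_a = random_genome(2000, rng)
genome_b = random_genome(2000, rng)

# Short l-mers are shared by almost any two genomes; long ones rarely are.
for l in (4, 8, 12, 16):
    print(l, round(shared_fraction(genome_a, genome_b, l), 4))
```

For small l nearly every possible l-mer occurs in both toy genomes, so an l-mer says little about its origin. As l grows, the shared fraction falls toward zero, which is the property that would let l-mers be grouped by genome.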
