Published in: Bioinformatics (U.S. National Institutes of Health literature)

MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample

Abstract

Motivation: Metagenomic binning remains an important topic in metagenomic analysis. Existing unsupervised binning methods for next-generation sequencing (NGS) reads do not perform well on (i) samples with low-abundance species or (ii) samples, even those with high-abundance species, that contain many extremely low-abundance species. Both problems are common in real metagenomic datasets, so binning methods that can solve them are desirable.

Results: We propose a two-round binning method (MetaCluster 5.0) that aims to identify both low-abundance and high-abundance species in the presence of a large amount of noise caused by many extremely low-abundance species. In summary, MetaCluster 5.0 uses a filtering strategy to remove the noise from the extremely low-abundance species. It separates reads of high-abundance species from those of low-abundance species in two different rounds. To overcome the low coverage of low-abundance species, multiple w values are used to group reads with overlapping w-mers: reads from high-abundance species are first grouped with high confidence based on a large w, and binning then expands to low-abundance species using a relaxed (shorter) w. Compared with the recent tools TOSS and MetaCluster 4.0, MetaCluster 5.0 can find more species (especially those with a low abundance of, say, 6× to 10×) and can achieve better sensitivity and specificity using less memory and running time.
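The idea of grouping reads by overlapping w-mers, and the effect of the choice of w, can be illustrated with a small sketch. This is not the MetaCluster 5.0 implementation: the function names (`wmers`, `group_reads`), the union-find grouping, the rule that a single shared w-mer links two reads, and the toy reads are all assumptions made for illustration. The noise filtering and the two-round separation described in the abstract are not modeled.

```python
from collections import defaultdict


def wmers(read, w):
    """Return the set of all length-w substrings (w-mers) of a read."""
    return {read[i:i + w] for i in range(len(read) - w + 1)}


def group_reads(reads, w):
    """Group reads that are linked by at least one shared w-mer.

    Hypothetical helper: links are transitive (union-find), so a chain
    of overlapping reads ends up in one group. Returns a sorted list of
    groups, each a list of read indices.
    """
    parent = list(range(len(reads)))

    def find(x):
        # Find the group representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Index: w-mer -> indices of the reads that contain it.
    index = defaultdict(list)
    for i, read in enumerate(reads):
        for mer in wmers(read, w):
            index[mer].append(i)

    # Merge all reads that share a given w-mer.
    for members in index.values():
        root = find(members[0])
        for other in members[1:]:
            parent[find(other)] = root

    groups = defaultdict(list)
    for i in range(len(reads)):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Toy reads (made up): reads 0 and 1 overlap by 8 bases, reads 2 and 3 by 6.
reads = ["ACGTACGTAA", "GTACGTAACC", "TTTTGGGGCC", "GGGGCCAAAT"]
for w in (8, 6, 2):
    print(w, group_reads(reads, w))
```

On these toy reads, a large w (8) links only the pair with the long overlap, a shorter w (6) also links the second pair, and a very short w (2) merges everything through chance matches. That trade-off between confidence and sensitivity is the reason the abstract gives for starting with a large w and relaxing it afterwards.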