Critical Assessment of Metagenome Interpretation – a benchmark of computational metagenomics software

Abstract

In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids. They include genomes with varying degrees of relatedness to each other and to publicly available ones, and they represent common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially affected performance, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions.
