Journal: BMC Genomics

Computational workflow for the fine-grained analysis of metagenomic samples

Abstract

Background: Metagenomics, the direct genetic analysis of the genomes contained in an environmental sample without prior culturing, is growing in popularity. Metagenomic studies aim to determine which species are present in an environmental community and to identify changes in species abundance under different conditions. Current metagenomic analysis software faces bottlenecks because of the high computational load required to analyze complex samples.

Results: An open-source computational workflow has been developed for the detailed analysis of metagenomes. The workflow provides new tools and datafile specifications that facilitate the identification of differences in the abundance of reads assigned to taxa (mapping), enables the detection of reads from low-abundance bacteria (producing evidence of their presence), and introduces new concepts for filtering spurious matches. Innovative visualization ideas for displaying metagenomic diversity are also proposed, to clarify how reads are mapped to taxa. Illustrative examples are based on two collections of metagenomes from the faecal microbial communities of adult female monozygotic and dizygotic twin pairs concordant for leanness or obesity, and of their mothers.

Conclusions: The proposed workflow provides an open environment in which the mapping process can be performed against different reference databases. The workflow also documents the specifications of the mapping process and the datafile formats, to facilitate the development of new plugins for further post-processing. This open and extensible platform has been designed to enable in-depth analysis of metagenomic samples and a better understanding of the underlying biological processes.
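As a rough illustration of what "differences in the abundance of reads assigned to taxa" can mean in practice, the sketch below compares two samples by the relative abundance of each taxon. It is not the workflow's own code or file format: the function names, the pseudocount, and the toy read assignments (`taxon_A`, `sample_1`, and so on) are all assumptions made up for this example.

```python
from collections import Counter
from math import log2


def relative_abundance(assignments):
    """Map each taxon to its share of the assigned reads (empty input gives {})."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


def abundance_shift(sample_a, sample_b, pseudo=1e-6):
    """log2 ratio of relative abundance (b over a) for every taxon in either sample.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a taxon is absent from one of the samples.
    """
    ra = relative_abundance(sample_a)
    rb = relative_abundance(sample_b)
    taxa = set(ra) | set(rb)
    return {
        t: log2((rb.get(t, 0.0) + pseudo) / (ra.get(t, 0.0) + pseudo))
        for t in taxa
    }


# Made-up toy data: one taxon label per mapped read (not from the study).
sample_1 = ["taxon_A"] * 6 + ["taxon_B"] * 3 + ["taxon_C"] * 1
sample_2 = ["taxon_A"] * 3 + ["taxon_B"] * 6 + ["taxon_C"] * 1

shift = abundance_shift(sample_1, sample_2)
for taxon in sorted(shift):
    print(f"{taxon}: {shift[taxon]:+.2f}")
```

A positive value means the taxon accounts for a larger share of reads in the second sample, a negative value a smaller share. A real analysis would also need the spurious-match filtering and low-abundance handling that the abstract mentions, which this sketch does not attempt.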