Comparison of different assembly and annotation tools on analysis of simulated viral metagenomic communities in the gut

Jorge F Vázquez-Castellanos; Rodrigo García-López; Vicente Pérez-Brocal; Miguel Pignatelli; Andrés Moya

Abstract

Background: The main limitations in the analysis of viral metagenomes are perhaps the high genetic variability of viruses and the lack of information in extant databases. To address these issues, several bioinformatic tools have been specifically designed or adapted for metagenomics by improving read assembly and creating more sensitive methods for homology detection. This study compares the performance of different available assemblers and taxonomic annotation software using simulated viral metagenomic data.

Results: We simulated two 454 viral metagenomes using genomes from NCBI's RefSeq database, based on the list of actual viruses found in previously published metagenomes. Three different assembly strategies, spanning six assemblers, were tested for performance: the overlap-layout-consensus algorithms Newbler, Celera and Minimo; the de Bruijn graph algorithms Velvet and MetaVelvet; and the read probabilistic model Genovo. The performance of the assemblies was measured by the length of the resulting contigs (using N50), the percentage of reads assembled and the overall accuracy when compared against the corresponding reference genomes. Additionally, the number of chimeras per contig and the lowest common ancestor were estimated in order to assess the effect of assembly on taxonomic and functional annotation. The functional classification of the reads was evaluated by counting the reads that correctly matched the functional data previously reported for the original genomes and by calculating the number of over-represented functional categories in chimeric contigs. The sensitivity and specificity of tBLASTx, PhymmBL and the k-mer frequencies were measured by the accuracy of their predictions when comparing simulated reads against the NCBI Virus genomes RefSeq database.

Conclusions: Assembly improves functional annotation by increasing accurate assignments and decreasing ambiguous hits between viruses and bacteria. However, its success is limited by the chimeric contigs that occur at all taxonomic levels. The assembler and its parameters should be selected based on the focus of each study. Minimo's non-chimeric contigs and Genovo's long contigs excelled in taxonomic assignment and functional annotation, respectively. tBLASTx stood out as the best approach to taxonomic annotation for virus identification. PhymmBL proved useful in datasets in which no related sequences are present, as it uses genomic features that may help identify distant taxa. The k-mer frequencies underperformed in all viral datasets.
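The abstract names N50 as the measure of contig length. As an illustration only, the following minimal Python sketch computes N50 under its usual definition: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the example lengths are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if it is empty)."""
    # Sort from longest to shortest so that long contigs are counted first.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that brings the running total to at least
        # half of the assembly length defines the N50.
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths in base pairs, for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20]
print(n50(example_lengths))
```

In this made-up example the total length is 290, so the threshold is 145; the two longest contigs (80 and 70) together reach 150, which makes the N50 equal to 70. A higher N50 indicates that more of the assembly sits in long contigs, but it says nothing about whether those contigs are chimeric, which is why the abstract reports chimeras separately.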

Bibliographic details

Source: BMC Genomics, 2014, Issue 1
Authors: Jorge F Vázquez-Castellanos; Rodrigo García-López; Vicente Pérez-Brocal; Miguel Pignatelli; Andrés Moya
Original format: PDF
Chinese Library Classification: Medical genetics
