Journal: Standards in Genomic Sciences

VIROME: a standard operating procedure for analysis of viral metagenome sequences

Abstract

One consistent finding among studies using shotgun metagenomics to analyze whole viral communities is that most viral sequences show no significant homology to known sequences. Thus, bioinformatic analyses based on sequence collections such as GenBank nr, which are largely composed of sequences from known organisms, tend to ignore a majority of the sequences within most shotgun viral metagenome libraries. Here we describe a bioinformatic pipeline, the Viral Informatics Resource for Metagenome Exploration (VIROME), that emphasizes the classification of viral metagenome sequences (predicted open reading frames, ORFs) based on homology search results against both known and environmental sequences. Functional and taxonomic information is derived from five annotated sequence databases that are linked to the UniRef 100 database. Environmental classifications are obtained from hits against a custom database, MetaGenomes On-Line, which contains 49 million predicted environmental peptides. Each predicted viral metagenomic ORF run through the VIROME pipeline is placed into one of seven ORF classes; thus, every sequence receives a meaningful annotation. Additionally, the pipeline includes quality control measures to remove contaminating and poor-quality sequences, and it assesses the potential amount of cellular DNA contamination in a viral metagenome library by screening for rRNA genes. Access to the VIROME pipeline and analysis results is provided through a web-application interface that is dynamically linked to a relational back-end database. The VIROME web-application interface is designed to give users flexibility in retrieving sequences (reads, ORFs, predicted peptides) and search results for focused secondary analyses.
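The idea of classifying each ORF by its hits against both a known-sequence database and an environmental database can be illustrated with a minimal Python sketch. This is not the VIROME implementation: the abstract does not list the seven ORF classes, so the four class labels below are simplified, hypothetical stand-ins, and the `OrfHits` record, the `classify_orf` function, and the E-value cutoff of 1e-3 are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed significance threshold; the abstract does not state a cutoff.
E_VALUE_CUTOFF = 1e-3


@dataclass
class OrfHits:
    """Best-hit E-values for one predicted ORF (hypothetical record layout)."""
    orf_id: str
    known_evalue: Optional[float] = None          # best hit vs. a known-sequence DB
    environmental_evalue: Optional[float] = None  # best hit vs. an environmental DB


def is_significant(evalue: Optional[float], cutoff: float = E_VALUE_CUTOFF) -> bool:
    """A hit counts only if it exists and its E-value is at or below the cutoff."""
    return evalue is not None and evalue <= cutoff


def classify_orf(hits: OrfHits, cutoff: float = E_VALUE_CUTOFF) -> str:
    """Assign one illustrative class label so that every ORF gets an annotation.

    The labels are placeholders, not the seven classes used by VIROME.
    """
    known = is_significant(hits.known_evalue, cutoff)
    environmental = is_significant(hits.environmental_evalue, cutoff)
    if known and environmental:
        return "known_and_environmental"
    if known:
        return "known_only"
    if environmental:
        return "environmental_only"
    return "no_significant_hit"


if __name__ == "__main__":
    examples = [
        OrfHits("orf_1", known_evalue=1e-20, environmental_evalue=1e-8),
        OrfHits("orf_2", environmental_evalue=1e-5),
        OrfHits("orf_3", known_evalue=0.5),
        OrfHits("orf_4"),
    ]
    for orf in examples:
        print(orf.orf_id, classify_orf(orf))
```

The point of the sketch is that an ORF with no hit to a known sequence still receives a label (for example from an environmental hit, or an explicit no-hit class), rather than being dropped from the analysis.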
