Journal: DNA Research: an international journal for rapid publication of reports on genes and genomes

A Novel Bioinformatics Strategy for Function Prediction of Poorly-Characterized Protein Genes Obtained from Metagenome Analyses



Abstract

Thanks to remarkable progress in DNA sequencing technology, vast quantities of genomic sequences have been decoded. Homology searches of amino acid sequences, using tools such as BLAST, have become a basic means of assigning functions to genes and proteins when genomic sequences are decoded. Although homology search has clearly been a powerful and irreplaceable method, the functions of only 50% or fewer of the genes can be predicted when a novel genome is decoded. A prediction method independent of homology search is urgently needed. By analyzing oligonucleotide compositions in genomic sequences, we previously developed a modified Self-Organizing Map, 'BLSOM', that clusters genomic fragments according to phylotype with no advance knowledge of phylotype. Using BLSOM for di-, tri- and tetrapeptide compositions, we developed a system that separates (self-organizes) proteins by function. When oligopeptide frequencies were analyzed in proteins previously classified into COGs (clusters of orthologous groups of proteins), BLSOMs faithfully reproduced the COG classifications. This indicates that proteins whose functions are unknown because they lack significant sequence similarity to proteins of known function can be related to such proteins based on similarity in oligopeptide composition. BLSOM was applied to predict the functions of vast quantities of proteins derived from mixed genomes in environmental samples.
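To make the idea of "oligopeptide composition" concrete, here is a minimal Python sketch of how a protein sequence might be turned into a di-, tri- or tetrapeptide frequency vector, the kind of input a self-organizing map could cluster. This is not the authors' implementation and does not implement BLSOM itself. The function names, the 20-letter alphabet ordering, the skipping of windows that contain non-standard residues, and the Euclidean distance are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def oligopeptide_composition(sequence, k=2):
    """Return normalized k-peptide frequencies as a list of length 20**k.

    Hypothetical helper: overlapping windows of length k are counted, and
    windows containing any non-standard residue (e.g. 'X') are skipped.
    """
    seq = sequence.upper()
    valid = set(AMINO_ACIDS)
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )
    total = sum(counts.values())
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    if total == 0:
        # No usable window: return an all-zero vector rather than divide by zero.
        return [0.0] * len(keys)
    return [counts[key] / total for key in keys]


def euclidean_distance(u, v):
    """Distance between two composition vectors (an illustrative choice)."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    query = oligopeptide_composition("MKVLAAGIVGLMKVLAAG", k=2)
    other = oligopeptide_composition("MKVLSAGIVGLMKILAAG", k=2)
    print(len(query), euclidean_distance(query, other))
```

Under these assumptions, two proteins with little alignable sequence similarity can still be compared through the distance between their composition vectors, which is the kind of relationship the abstract describes BLSOM as exploiting.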
