Journal: Bioinformatics

ESG: extended similarity group method for automated protein function prediction

Abstract

Motivation: Accurate automatic protein function prediction is increasingly important given the large number of newly sequenced genomes and proteomics datasets that await biological interpretation. Conventional methods have focused on annotation transfer based on high sequence similarity, which relies on the concept of homology. However, many cases have been reported in which simply transferring function from the top hits of a homology search causes erroneous annotation. New methods are needed that handle sequence similarity more robustly, combining signals from strongly and weakly similar proteins to predict the function of unknown proteins with high reliability.

Results: We present the extended similarity group (ESG) method, which performs iterative sequence database searches and annotates a query sequence with Gene Ontology terms. Each annotation is assigned a probability based on its relative similarity score with the multiple-level neighbors in the protein similarity graph. We show how the statistical framework of ESG improves prediction accuracy by iteratively taking into account the neighborhood of the query protein in sequence similarity space. ESG outperforms conventional PSI-BLAST and the protein function prediction (PFP) algorithm. The iterative search is effective in capturing multiple domains in a query protein, enabling accurate prediction of several functions that originate from different domains.
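The idea of scoring Gene Ontology terms from multiple-level neighbors can be illustrated with a small sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: hit weights proportional to a shifted -log10(E-value), a mixing parameter `mix` between a hit's own annotations and those of its neighbors, the function names `hit_weights` and `term_scores`, and the toy search results and annotations. It is not the authors' implementation.

```python
import math
from collections import defaultdict


def hit_weights(evalues, shift=2.0):
    """Turn E-values into weights that sum to 1 (assumed weighting scheme)."""
    # Clamp E-values to avoid log10(0); negative scores are floored at 0.
    raw = [max(-math.log10(max(e, 1e-300)) + shift, 0.0) for e in evalues]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else [0.0] * len(raw)


def term_scores(protein, search, annotations, mix=0.5, levels=2):
    """Score GO terms for `protein` from its hits, recursing `levels` deep.

    search:      protein id -> list of (hit id, E-value), a stand-in for
                 the results of a sequence database search
    annotations: protein id -> set of GO term ids
    mix:         assumed weight of a hit's own annotations versus the
                 scores derived from that hit's own neighbors
    """
    hits = search.get(protein, [])
    weights = hit_weights([e for _, e in hits])
    scores = defaultdict(float)
    for (hit, _), w in zip(hits, weights):
        direct = annotations.get(hit, set())
        if levels > 1:
            # Second-level search: use the hit itself as a query.
            neighbor = term_scores(hit, search, annotations, mix, levels - 1)
            for term in set(direct) | set(neighbor):
                own = 1.0 if term in direct else 0.0
                scores[term] += w * (mix * own + (1 - mix) * neighbor.get(term, 0.0))
        else:
            for term in direct:
                scores[term] += w
    return dict(scores)


# Toy data (hypothetical): two first-level hits, each with its own hits.
search = {
    "query": [("A", 1e-50), ("B", 1e-10)],
    "A": [("C", 1e-30)],
    "B": [("C", 1e-20), ("D", 1e-5)],
}
annotations = {
    "A": {"GO:0003824"},
    "B": {"GO:0005515"},
    "C": {"GO:0003824"},
    "D": {"GO:0005515"},
}

scores = term_scores("query", search, annotations)
for term, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(term, round(value, 4))
```

In this toy case the term carried by the strong hit A and reinforced by the second-level hit C ranks first, while the term seen only through weaker hits still receives a nonzero score. That is the sense in which signals from strongly and weakly similar proteins are combined.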
