Published in: Protein Science: A Publication of the Protein Society (U.S. National Institutes of Health literature collection)

A categorization approach to automated ontological function annotation

Abstract

Automated function prediction (AFP) methods increasingly use knowledge discovery algorithms to map sequence, structure, literature, and/or pathway information about proteins whose functions are unknown into functional ontologies, typically (a portion of) the Gene Ontology (GO). While there is a growing number of methods within this paradigm, the general problem of assessing the accuracy of such prediction algorithms has not been seriously addressed. We first present an application for function prediction from protein sequences using the POSet Ontology Categorizer (POSOC) to produce new annotations by analyzing collections of GO nodes derived from annotations of protein BLAST neighborhoods. We then present hierarchical precision and hierarchical recall as new evaluation metrics for assessing the accuracy of any predictions in hierarchical ontologies, and discuss results on a test set of protein sequences. We show that our method provides substantially improved hierarchical precision (a measure of how many of the predictions made are correct) when applied to the nearest BLAST neighbors of target proteins, as compared with simply imputing that neighborhood's annotations to the target. Moreover, when our method is applied to a broader BLAST neighborhood, hierarchical precision is enhanced even further. In all cases, this increase in hierarchical precision is purchased at a modest expense of hierarchical recall (a measure of how many of the true annotations are predicted at all).
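To make the two metrics concrete, the following is a minimal sketch of one common set-based way to compute hierarchical precision and recall, in which predicted and true annotations are each expanded to include all of their ancestors before being compared. This is an illustration under stated assumptions, not necessarily the exact definition used in the paper: the toy ontology `TOY_PARENTS`, the function names, and the choice to count the root node are all hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy ontology (child -> direct parents), standing in for a GO-like DAG.
TOY_PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "kinase": {"catalysis"},
    "protein_kinase": {"kinase"},
}


def ancestors(node: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the node together with all of its ancestors."""
    seen: Set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def upward_closure(nodes: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Union of the ancestor sets of every node in `nodes`."""
    closed: Set[str] = set()
    for node in nodes:
        closed |= ancestors(node, parents)
    return closed


def hierarchical_precision_recall(
    predicted: Iterable[str],
    true: Iterable[str],
    parents: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Set-based hierarchical precision and recall (assumed formulation).

    Precision: share of the predicted closure that is also in the true closure.
    Recall: share of the true closure that is also in the predicted closure.
    The root is counted here; some formulations exclude it.
    """
    pred_up = upward_closure(predicted, parents)
    true_up = upward_closure(true, parents)
    overlap = len(pred_up & true_up)
    hp = overlap / len(pred_up) if pred_up else 0.0
    hr = overlap / len(true_up) if true_up else 0.0
    return hp, hr


# A correct but less specific prediction: full precision, reduced recall.
hp_general, hr_general = hierarchical_precision_recall(
    {"kinase"}, {"protein_kinase"}, TOY_PARENTS
)

# A prediction in the wrong branch: only the root is shared.
hp_wrong, hr_wrong = hierarchical_precision_recall(
    {"dna_binding"}, {"protein_kinase"}, TOY_PARENTS
)
```

In this toy setting, predicting the more general `kinase` for a protein truly annotated as `protein_kinase` gives precision 1.0 and recall 0.75, which mirrors the trade-off the abstract describes: more conservative predictions can raise hierarchical precision while giving up some hierarchical recall.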
