Protein-Protein Interaction Identification Based on Relational Similarity

Journal: 《计算机技术与发展》 (Computer Technology and Development)

Abstract

To address the shortcomings of current protein-protein interaction (PPI) identification approaches, which rely on single sentences, this paper proposes a relational similarity method that identifies PPIs automatically by searching large-scale text. First, the signature of a protein pair is obtained by searching a biomedical literature database. Next, important features are extracted from the signatures to build a vector space model of the protein pair. Finally, a K-nearest-neighbor classifier is applied to identify PPIs. The experiments compare how different distance measures under the vector space model affect classification performance and arrive at a reasonable similarity function. The results show that the approach achieves high and well-balanced precision and recall when cosine is used as the similarity measure. In addition, the approach makes direct use of known PPIs and thus removes the burden of additional manual annotation.
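The three steps in the abstract (signature, vector space model, nearest-neighbor vote with cosine similarity) can be illustrated with a minimal sketch. This is not the paper's implementation. The function names, the whitespace tokenizer, the use of raw term counts as features, the choice of k = 3, and the toy snippets (with protein names already replaced by the placeholders P1 and P2) are all assumptions made for illustration.

```python
import math
from collections import Counter


def signature_vector(snippets):
    """Turn a protein pair's signature (a list of text snippets) into a
    bag-of-words term-count vector. Whitespace tokenization is an assumption."""
    counts = Counter()
    for text in snippets:
        counts.update(text.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty signature is treated as similar to nothing
    return dot / (norm_a * norm_b)


def knn_predict(query, labeled, k=3):
    """Majority vote among the k labeled pairs most similar to the query."""
    ranked = sorted(
        labeled,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: known pairs labeled True (interacting) or False.
training = [
    (signature_vector(["P1 binds P2", "P1 interacts with P2"]), True),
    (signature_vector(["P1 phosphorylates P2", "P1 binds P2 directly"]), True),
    (signature_vector(["P1 and P2 were measured in serum"]), False),
    (signature_vector(["levels of P1 and P2 were measured"]), False),
]

interacting_query = signature_vector(["P1 binds P2 and interacts with P2"])
unrelated_query = signature_vector(["P1 and P2 were measured"])

prediction_a = knn_predict(interacting_query, training, k=3)
prediction_b = knn_predict(unrelated_query, training, k=3)
```

Because the labeled pairs stand in for known PPIs, classifying a new pair needs no sentence-level annotation, which matches the abstract's point about reusing existing interaction data. Swapping `cosine_similarity` for another measure is one way to mimic the comparison of distance strategies that the abstract describes.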
