Conference: 7th International Conference on Natural Language Processing and Knowledge Engineering

Protein-Protein Interaction extraction based on ensemble kernel model and active learning strategy


Abstract

Protein-protein interaction (PPI) extraction from the biomedical literature can quickly supply biomedical researchers with useful information. This paper presents a PPI extraction system based on an ensemble kernel model and active learning. First, the ensemble kernel, used within an SVM classifier, combines a lexical feature-based kernel and a path-based kernel. Under 10-fold cross-validation, the ensemble kernel model achieves F-scores of 64.50%, 69.74% and 60.38% on the AIMED, IEPA and BCPPI corpora, respectively, outperforming the lexical feature-based kernel and the path-based kernel used separately. Because the SVM-based ensemble kernel model needs a large amount of labeled data, and manual labeling is expensive, we integrate active learning into the model, using an uncertainty-based sampling strategy. With active learning, the F-scores on the AIMED, IEPA and BCPPI corpora are 65.24%, 70.19% and 61.87%, respectively, which are better than those of the ensemble kernel model with passive learning, while the amount of labeled data is reduced by 20%, 30% and 30%, respectively.
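
The two ideas in the abstract, combining kernels and picking uncertain instances for labeling, can be illustrated with a minimal Python sketch. The abstract does not say how the kernels are combined or how uncertainty is measured, so everything here is an assumption for illustration: the kernels are combined as a normalized weighted sum, uncertainty is taken as the distance to the SVM decision boundary, and the names `lexical_kernel`, `path_kernel`, `ensemble_kernel` and `select_most_uncertain` are hypothetical stand-ins, not the authors' implementation.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical instance layout: a bag of word counts under "lexical"
# and a set of dependency-path labels under "path".
Instance = Dict[str, object]
Kernel = Callable[[Instance, Instance], float]


def lexical_kernel(a: Instance, b: Instance) -> float:
    """Stand-in lexical kernel: dot product of word-count vectors."""
    counts_b = b["lexical"]
    return float(sum(n * counts_b.get(tok, 0) for tok, n in a["lexical"].items()))


def path_kernel(a: Instance, b: Instance) -> float:
    """Stand-in path kernel: number of path labels the two instances share."""
    return float(len(a["path"] & b["path"]))


def ensemble_kernel(kernels: Sequence[Kernel], weights: Sequence[float]) -> Kernel:
    """Assumed combination rule: a weighted sum with weights normalized to 1.

    A non-negative weighted sum of valid kernels is itself a valid kernel.
    """
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative and not all zero")
    total = sum(weights)
    normalized = [w / total for w in weights]

    def combined(a: Instance, b: Instance) -> float:
        return sum(w * k(a, b) for w, k in zip(normalized, kernels))

    return combined


def select_most_uncertain(decision_values: Sequence[float], batch_size: int) -> List[int]:
    """Assumed uncertainty measure: smallest absolute SVM decision value.

    Returns the indices of the unlabeled instances closest to the boundary,
    which would be sent to an annotator in the next active learning round.
    """
    ranked = sorted(range(len(decision_values)), key=lambda i: abs(decision_values[i]))
    return ranked[:batch_size]


# Usage with made-up instances and made-up decision values.
a = {"lexical": {"binds": 2, "to": 1}, "path": {"nsubj", "dobj"}}
b = {"lexical": {"binds": 1}, "path": {"nsubj"}}
kernel = ensemble_kernel([lexical_kernel, path_kernel], weights=[3.0, 1.0])
similarity = kernel(a, b)  # 0.75 * 2 + 0.25 * 1
queried = select_most_uncertain([1.2, -0.1, 0.4, -2.0, 0.05], batch_size=2)
```

In a full system the combined kernel would be passed to an SVM trainer as a precomputed Gram matrix, and the decision values would come from the trained classifier applied to the unlabeled pool. Both steps are omitted here to keep the sketch free of third-party dependencies.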
