Published in: Conference of the European Chapter of the Association for Computational Linguistics

Entity Extraction in Biomedical Corpora: An Approach to Evaluate Word Embedding Features with PSO based Feature Selection


Abstract

Text mining has drawn significant attention in the recent past due to the rapid growth in biomedical and clinical records. Entity extraction is one of the fundamental components of biomedical text mining. In this paper, we propose a novel approach to feature selection for entity extraction that exploits the concepts of deep learning and Particle Swarm Optimization (PSO). The system utilizes word embedding features along with several other features extracted by studying the properties of the datasets. We make the interesting observation that the compact set of word embedding features determined by PSO is more effective for entity extraction than the entire word embedding feature set. The proposed system is evaluated on three benchmark biomedical datasets: GENIA, GENETAG, and AiMed. The effectiveness of the proposed approach is evident from significant performance gains over the baseline models as well as other existing systems. We observe improvements of 7.86%, 5.27%, and 7.25% F-measure points over the baseline models for the GENIA, GENETAG, and AiMed datasets, respectively.
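The abstract does not state which PSO variant, fitness function, or classifier the authors use, so the following is only a minimal sketch of how PSO-based selection of embedding dimensions might look. It assumes a generic binary PSO with a sigmoid transfer function. The names `binary_pso` and `toy_fitness`, the hyperparameter values, and the toy objective are all hypothetical; in a real system the fitness would presumably be something like the F-measure of an entity tagger trained on the selected features.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic binary PSO (an assumption, not the paper's exact method).

    Each particle is a 0/1 mask over feature dimensions; `fitness` maps a
    mask to a score to maximize. Returns (best_mask, best_score).
    """
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]

    # Personal bests and the global best start from the initial swarm.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-6.0, min(6.0, v))  # clamp to keep sigmoid unsaturated
                vel[i][d] = v
                # Sigmoid of the velocity is the probability the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Hypothetical toy objective: dimensions 0, 3 and 7 are "informative";
# every selected dimension costs 0.1, which rewards compact subsets.
INFORMATIVE = (0, 3, 7)


def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)


mask, score = binary_pso(toy_fitness, n_features=10)
selected = [d for d, bit in enumerate(mask) if bit]
print("selected dimensions:", selected, "score:", score)
```

The per-feature penalty in the toy objective is one simple way to favor compact subsets, which mirrors the abstract's observation that a reduced embedding feature set can outperform the full one. Whether the authors penalize subset size explicitly is not stated.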
