Conference: International Conference on Information Retrieval and Knowledge Management

Investigation of Data Representation Methods with Machine Learning Algorithms for Biomedical Named Entity Recognition


Abstract

Recognizing biomedical entities such as genes, proteins, chemicals, and diseases is the first and most fundamental task in biomedical literature mining. Most recent biomedical named entity recognition (Bio-NER) methods rely on predefined features that try to capture the specific surface properties of entity types. However, these empirically predefined feature sets differ between entity types, and they are complex and manually constructed, which makes their development costly. This paper presents a comparative evaluation of a traditional feature representation method and new prototypical representation methods with three machine learning classifiers (Support Vector Machine (SVM), Naive Bayes (NB), and K-Nearest Neighbor (KNN)) for Bio-NER. Several comparative experiments are conducted on the GENIA corpus, a widely used standard Bio-NER dataset. The paper demonstrates that prototypical word representation methods can be used successfully for Bio-NER. Experimental results show that the prototypical representation methods improve the performance of all three machine learning models, and that the SVM classifier with prototypical representation methods yields the best result.
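The contrast between the two kinds of representation can be illustrated with a small sketch. The abstract does not list the paper's feature set or say how its prototypical representations are built, so everything below is an assumption made for illustration: the toy tokens and labels are invented (not taken from GENIA), `surface_features` is a hypothetical hand-crafted feature set, and `prototype_features` stands in for a prototypical representation by scoring a token's character-trigram cosine similarity to a few hypothetical prototype words. Only a plain KNN classifier is shown, one of the three classifiers the abstract names.

```python
import math
from collections import Counter

# Toy labelled tokens, invented for illustration (not taken from GENIA).
TRAIN = [("IL-2", "protein"), ("NF-kappaB", "protein"), ("p53", "protein"),
         ("the", "O"), ("binding", "O"), ("cells", "O")]

# Hypothetical prototype words; the paper's actual prototypes are not given.
PROTOTYPES = ["IL-2", "p53", "the", "cells"]


def surface_features(token):
    """Hand-crafted surface cues of the kind predefined feature sets encode."""
    return [
        float(token[:1].isupper()),                 # starts with a capital
        float(any(c.isdigit() for c in token)),     # contains a digit
        float("-" in token),                        # contains a hyphen
        float(token.isupper() and len(token) > 1),  # no lowercase letters
        len(token) / 10.0,                          # scaled token length
    ]


def char_ngrams(token, n=3):
    """Count character n-grams of the lowercased, boundary-padded token."""
    padded = f"#{token.lower()}#"
    return Counter(padded[i:i + n] for i in range(max(1, len(padded) - n + 1)))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def prototype_features(token, prototypes=PROTOTYPES):
    """Represent a token by its similarity to each prototype word."""
    grams = char_ngrams(token)
    return [cosine(grams, char_ngrams(p)) for p in prototypes]


def knn_predict(train_vectors, query, k=3):
    """Majority label among the k nearest training vectors (Euclidean)."""
    nearest = sorted(train_vectors, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def classify(token, featurize, k=3):
    """Featurize the toy training set and the token, then run KNN."""
    train_vectors = [(featurize(t), label) for t, label in TRAIN]
    return knn_predict(train_vectors, featurize(token), k)


for token in ["IL-4", "with"]:
    print(token, classify(token, surface_features),
          classify(token, prototype_features))
```

The classifier is identical in both calls and only the `featurize` function changes, which mirrors the kind of comparison the abstract describes. A toy example like this says nothing about which representation performs better on real data.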
