Conference: Seventh International Conference on Semantics, Knowledge, and Grids

The Method of Medical Named Entity Recognition Based on Semantic Model and Improved SVM-KNN Algorithm


Abstract

In the medical field, much unstructured information expressed in natural language exists in medical literature, technical documentation, and medical records. Information Extraction (IE), one of the most important research directions in natural language processing, aims to help people automatically extract the information they care about. Named Entity Recognition (NER) is a subsystem of IE and directly influences its quality. NER in the medical field has not yet reached ideal precision, largely because of the complexity of medical knowledge. Medical entities are hard to describe precisely during feature selection, and the features currently selected are not rich and lack semantic information. In addition, different medical entities with similar characteristics easily cause a classification algorithm to make wrong judgments. Combined classification algorithms such as SVM-KNN can overcome the disadvantages of their component classifiers and achieve higher performance. However, the current SVM-KNN classification algorithm may produce wrong category results when the K value is set inappropriately or the examples are unevenly distributed. In this paper, we introduce a two-level modelling tool that helps specialists build semantic models and select features from them. We design and implement a medical named entity recognition analysis engine based on the UIMA framework and adopt an improved SVM-KNN algorithm, called EK-SVM-KNN (Extending K SVM-KNN), for classification. We collect experimental data from the SLE (Systemic Lupus Erythematosus) clinical information system at Renji Hospital, using 50 pathology reports as training data and 1000 pathology reports as test data. The experiment shows that medical NER based on the semantic model and the improved SVM-KNN algorithm can enhance the quality of NER, with precision, recall, and F value reaching up to 86%.
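
The abstract does not describe the internals of EK-SVM-KNN, so the following is only a minimal sketch of the generic SVM-KNN hybrid idea it builds on, not the authors' method. It assumes an already trained linear SVM and falls back to a K-nearest-neighbour vote over the support vectors when a sample lies close to the separating hyperplane. The weights `W`, bias `B`, support vectors, threshold `eps`, and `k` are all hypothetical values chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical parameters of a pre-trained linear SVM (illustrative only,
# not taken from the paper).
W = (1.0, -1.0)
B = 0.0

# Hypothetical support vectors with labels (+1 / -1); they serve as the
# reference points for the KNN fallback.
SUPPORT_VECTORS = [
    ((1.0, 0.0), 1),
    ((2.0, 1.0), 1),
    ((0.0, 1.0), -1),
    ((1.0, 2.0), -1),
    ((0.4, 0.6), -1),
]


def decision_value(x):
    """Signed score w.x + b; its magnitude grows with distance from the hyperplane."""
    return sum(wi * xi for wi, xi in zip(W, x)) + B


def knn_label(x, k):
    """Majority vote among the k support vectors closest to x (Euclidean distance)."""
    nearest = sorted(SUPPORT_VECTORS, key=lambda sv: math.dist(x, sv[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def svm_knn_predict(x, eps=0.5, k=3):
    """Trust the SVM when |score| >= eps; otherwise defer to the KNN vote."""
    score = decision_value(x)
    if abs(score) >= eps:
        return 1 if score > 0 else -1
    return knn_label(x, k)
```

In this sketch, a sample such as `(3.0, 0.0)` is far from the hyperplane and is labelled by the SVM alone, while `(0.5, 0.6)` falls inside the `eps` band and is decided by its nearest support vectors. The KNN branch is where the choice of `k` and the class balance of the reference points matter, which are the two weaknesses the abstract attributes to plain SVM-KNN.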
