The Method of Medical Named Entity Recognition Based on Semantic Model and Improved SVM-KNN Algorithm

Abstract

In the medical field, a large amount of unstructured information expressed in natural language exists in medical literature, technical documentation and medical records. Information Extraction (IE), one of the most important research directions in natural language processing, aims to help people automatically extract the information they care about. Named Entity Recognition (NER) is a subsystem of IE and directly influences its quality. NER in the medical field has not yet reached ideal precision, largely because of the complexity of medical knowledge. It is hard to describe a medical entity precisely during feature selection, and the features currently selected are not rich and lack semantic information. In addition, different medical entities with similar characteristics easily lead a classification algorithm to wrong judgments. Combining several classification algorithms, as in SVM-KNN, can overcome the disadvantages of each individual algorithm and achieve higher performance. However, the current SVM-KNN classification algorithm may assign wrong categories when the K value is set inappropriately or the examples are unevenly distributed. In this paper, we introduce a two-level modelling tool that helps specialists build semantic models and select features from them. We design and implement a medical named entity recognition analysis engine based on the UIMA framework and adopt an improved SVM-KNN algorithm, called EK-SVM-KNN (Extending K SVM-KNN), for classification. We collect experimental data from the SLE (Systemic Lupus Erythematosus) clinical information system at Renji Hospital, using 50 pathology reports as training data and 1000 pathology reports as test data. The experiment shows that medical NER based on the semantic model and the improved SVM-KNN algorithm enhances the quality of NER, with precision, recall and F-value reaching up to 86%.
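The abstract does not spell out how EK-SVM-KNN works, so the following is only a minimal sketch of the baseline SVM-KNN idea as it is commonly described: trust the SVM decision when a sample lies far from the separating hyperplane, and fall back to a K-nearest-neighbour vote over the support vectors when it lies close. Everything here is an assumption made for illustration: the linear decision function, the threshold `eps`, the function names, and the toy weights and support vectors, none of which come from the paper.

```python
import math
from collections import Counter


def decision_value(x, w, b):
    """Signed score of a linear decision function: g(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(p, q)))


def knn_vote(x, labelled_points, k):
    """Majority label among the k labelled points nearest to x."""
    nearest = sorted(labelled_points, key=lambda item: euclidean(x, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def svm_knn_predict(x, w, b, support_vectors, k=3, eps=0.5):
    """Hybrid rule: SVM sign if |g(x)| > eps, otherwise KNN over support vectors."""
    g = decision_value(x, w, b)
    if abs(g) > eps:
        # Far from the hyperplane: the SVM decision is taken as reliable.
        return 1 if g > 0 else -1
    # Close to the hyperplane: defer to the nearest support vectors.
    return knn_vote(x, support_vectors, k)


# Hypothetical toy model: the boundary is the line x0 = 0, and the
# support vectors sit on the margins x0 = 1 and x0 = -1.
W, B = (1.0, 0.0), 0.0
SUPPORT_VECTORS = [
    ((1.0, 0.0), 1),
    ((1.0, 1.0), 1),
    ((-1.0, 0.0), -1),
    ((-1.0, 2.0), -1),
]

if __name__ == "__main__":
    for k in (1, 3):
        print(k, svm_knn_predict((0.1, 3.0), W, B, SUPPORT_VECTORS, k=k))
```

With this toy data, the ambiguous point `(0.1, 3.0)` is labelled -1 for `k=1` and 1 for `k=3`, which illustrates the sensitivity to the K value that the abstract mentions.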

Bibliographic details

Source
Seventh International Conference on Semantics, Knowledge, and Grids | 2011 | pp. 21-27 | 7 pages
Authors
Han Xia; Ruonan Rao
Original format
PDF
Chinese Library Classification
Programming
Keywords
NER; SVM-KNN; Semantic Model; UIMA

Similar literature

1. Balanced undersampling: a novel sentence-based undersampling method to improve recognition of named entities in chemical and biomedical text [J]. Akkasi Abbas, Varoglu Ekrem, Dimililer Nazife. Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies. 2018, No. 8
2. Biomedical named entity recognition based on recurrent neural networks with different extended methods [J]. Song Dingxin, Li Lishuang, Jin Liuke. International Journal of Data Mining and Bioinformatics. 2016, No. 1
3. Chinese electronic medical record named entity recognition algorithm based on transfer learning [J]. Li Yi, Liu Jianyi, Zhang Ru. Basic & Clinical Pharmacology & Toxicology. 2020, No. S9
4. The Method of Medical Named Entity Recognition Based on Semantic Model and Improved SVM-KNN Algorithm [C]. Han Xia, Ruonan Rao. International Conference on Semantics, Knowledge and Grid. 2011
5. Advancing Biomedical Named Entity Recognition with Multivariate Feature Selection and Semantically Motivated Features [D]. Leaman, James Robert, Jr. 2013
6. SBLC: a hybrid model for disease named entity recognition based on semantic bidirectional LSTMs and conditional random fields [O]. Kai Xu, Zhanfan Zhou, Tao Gong. 2018
7. Medical Named Entity Recognition from Un-labelled Medical Records based on Pre-trained Language Models and Domain Dictionary [O]. Chaojie Wen, Tao Chen, Xudong Jia. 2021
