Conference: IEEE International Conference on Consumer Electronics and Computer Engineering

Incorporating lexicon and character glyph and morphological features into BiLSTM-CRF for Chinese medical NER


Abstract

Chinese medical named entity recognition (CMNER) is a fundamental task for information processing and intelligent medical services in the Chinese medical domain. To make full use of both character-level and word-level information in the text, this paper proposes a character-based BiLSTM-CRF recognition model that integrates lexicon and character features. First, the C-ExSoftword method is proposed to integrate the lexicon into the model, so that word and word-sequence information in the text is exploited. Second, motivated by the similarity of Chinese character glyphs and of morphologically related forms in the field of Chinese medicine, the paper takes four-corner coding as the glyph feature of each Chinese character and extracts the morphological features of each word with an improved Bidirectional Maximum Matching (BDMM) algorithm. The glyph features and morphological features are integrated into each character vector. Finally, experiments on real data sets show that the proposed model outperforms the character-based and word-based models.
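The abstract names Bidirectional Maximum Matching but does not describe the paper's improvements to it. As a rough illustration only, the following is a minimal sketch of the standard textbook BDMM heuristic, not the improved variant used in the paper. The function names, the toy lexicon, and the B/M/E/S position tags used to attach word information to each character are all assumptions made for this example.

```python
from typing import List, Set


def forward_max_match(text: str, lexicon: Set[str], max_len: int) -> List[str]:
    """Scan left to right, taking the longest lexicon word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text: str, lexicon: Set[str], max_len: int) -> List[str]:
    """Scan right to left, taking the longest lexicon word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                words.append(piece)
                j -= size
                break
    words.reverse()
    return words


def bidirectional_max_match(text: str, lexicon: Set[str]) -> List[str]:
    """Standard BDMM heuristic: prefer fewer words, then fewer single characters,
    and fall back to the backward result on a full tie."""
    max_len = max((len(w) for w in lexicon), default=1)
    fwd = forward_max_match(text, lexicon, max_len)
    bwd = backward_max_match(text, lexicon, max_len)
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd
    singles_fwd = sum(len(w) == 1 for w in fwd)
    singles_bwd = sum(len(w) == 1 for w in bwd)
    return fwd if singles_fwd < singles_bwd else bwd


def bmes_tags(words: List[str]) -> List[str]:
    """Hypothetical per-character feature: position of each character in its word
    (B = begin, M = middle, E = end, S = single-character word)."""
    tags: List[str] = []
    for w in words:
        if len(w) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(w) - 2) + ["E"])
    return tags


if __name__ == "__main__":
    toy_lexicon = {"研究", "研究生", "生命", "起源"}  # toy lexicon, not from the paper
    segmented = bidirectional_max_match("研究生命起源", toy_lexicon)
    print(segmented)
    print(bmes_tags(segmented))
```

In this sketch the forward and backward passes disagree on the toy sentence, and the tie-breaking rule selects the segmentation with fewer single-character words. One tag per character is produced, which is one plausible way such word-level information could be attached to character vectors; the paper's actual feature encoding may differ.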
