Conference: International Conference on Bioinformatics Models, Methods and Algorithms

Towards a Unified Named Entity Recognition System Disease Mention Identification


Abstract

Named Entity Recognition (NER) is an essential prerequisite for effective text mining of biomedical text. Because the biomedical literature has grown rapidly in recent years, exploiting unlabeled text to improve system performance has become an active and challenging research topic in text mining. In this study, we take a step towards a unified NER system for the biomedical, chemical and medical domains. We evaluate word representation features learnt automatically from a large unlabeled corpus for disease NER. The word representation features include Brown cluster labels and Word Vector Classes (WVC), which are built by applying k-means clustering to the continuous-valued word vectors of a Neural Language Model (NLM). The experimental evaluation on the Arizona Disease Corpus (AZDC) showed that these word representation features boost system performance as significantly as a manually tuned domain dictionary does. BANNER-CHEMDNER, a chemical and biomedical NER system, has been extended with a disease mention recognition model that achieves a 77.84% F-measure on AZDC under 10-fold cross-validation. BANNER-CHEMDNER is freely available at: https://bitbucket.org/tsendeemts/banner-chemdner.
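
The Word Vector Class idea described in the abstract (k-means over word vectors, with the cluster id used as a token feature) can be illustrated with a small sketch. This is not BANNER-CHEMDNER's code. The 2-D toy vectors, the function names (`kmeans_word_classes`, `wvc_feature`) and the `WVC=<id>` feature string are all hypothetical, chosen only for illustration. Real vectors would come from a trained neural language model and have far more dimensions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans_word_classes(word_vectors, k, iterations=20, seed=0):
    """Cluster word vectors with plain k-means; return {word: class_id}."""
    words = sorted(word_vectors)
    rng = random.Random(seed)
    # Initialise centroids from k distinct words (a simple, assumed choice).
    centroids = [list(word_vectors[w]) for w in rng.sample(words, k)]
    assignment = {}
    for _ in range(iterations):
        # Assignment step: attach each word to its nearest centroid.
        new_assignment = {
            w: min(range(k),
                   key=lambda c: squared_distance(word_vectors[w], centroids[c]))
            for w in words
        }
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [word_vectors[w] for w in words if assignment[w] == c]
            if members:
                centroids[c] = mean_vector(members)
    return assignment


def wvc_feature(token, word_classes):
    """Map a token to a discrete class feature; unseen words get 'UNK'."""
    return "WVC=" + str(word_classes.get(token.lower(), "UNK"))


# Hypothetical toy vectors: disease-like words vs. molecule-like words.
toy_vectors = {
    "cancer": [0.9, 0.1], "tumor": [0.8, 0.2], "diabetes": [0.85, 0.15],
    "protein": [0.1, 0.9], "gene": [0.2, 0.8], "enzyme": [0.15, 0.85],
}
classes = kmeans_word_classes(toy_vectors, k=2)
print(wvc_feature("Tumor", classes), wvc_feature("aspirin", classes))
```

Words whose vectors lie close together end up sharing a class id, so a sequence labeller can treat the id as a discrete feature, much as it would a Brown cluster label or a dictionary match.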
