Recognizing irregular entities in biomedical text via deep neural networks

Abstract

Named entity recognition (NER) is an important task in biomedical text mining. Most prior work has focused on recognizing regular entities, which consist of continuous word sequences and do not overlap with each other. In this paper, we propose a neural network model called Bi-LSTM-CRF, which consists of bidirectional (Bi) long short-term memories (LSTMs) and conditional random fields (CRFs), to identify regular entities and the components of irregular entities. The components are then combined into final irregular entities according to manually designed rules. Furthermore, we propose a novel model called NerOne that consists of the Bi-LSTM-CRF network and another Bi-LSTM network. The Bi-LSTM-CRF network performs the same task as the aforementioned model, and the Bi-LSTM network determines whether two components should be combined. NerOne therefore combines the components automatically instead of relying on manually designed rules. We evaluate our models on two datasets for recognizing regular and irregular biomedical entities. Experimental results show that, with less feature engineering, the performance of our models is comparable to that of state-of-the-art systems. We also show that automatically combining the components is as effective as manually designing rules. Our work can facilitate research on biomedical text mining. © 2017 Published by Elsevier B.V.
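The two-stage idea described in the abstract (tag components first, then decide pairwise which components to combine) can be illustrated with a minimal sketch. This is not the authors' implementation: the BIO tag scheme, the `COMP` label, the function names `decode_spans` and `combine_components`, the example sentence, and the toy combination rule are all assumptions made for illustration. In the paper, the tags would come from the Bi-LSTM-CRF network and the pairwise decision from manual rules or from the second Bi-LSTM network.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def decode_spans(tags: List[str]) -> List[Span]:
    """Turn BIO-style tags (assumed scheme) into (start, end, label) spans."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def combine_components(
    components: List[Span],
    should_combine: Callable[[Span, Span], bool],
) -> List[Tuple[Span, Span]]:
    """Pair up components whenever the (hypothetical) decision function agrees."""
    entities = []
    for i, left in enumerate(components):
        for right in components[i + 1:]:
            if should_combine(left, right):
                entities.append((left, right))
    return entities


def surface(tokens: List[str], entity: Tuple[Span, ...]) -> str:
    """Join the token text of each component into one entity string."""
    return " ".join(" ".join(tokens[s:e]) for s, e, _ in entity)


# Hypothetical example: two overlapping, partly discontinuous entities.
tokens = ["severe", "joint", "and", "muscle", "pain"]
tags = ["O", "B-COMP", "O", "B-COMP", "B-COMP"]  # assumed tagger output

components = decode_spans(tags)

# Toy stand-in for the rules or the pairwise classifier:
# attach every earlier component to the last component in the sentence.
entities = combine_components(components, lambda a, b: b == components[-1])
names = [surface(tokens, e) for e in entities]
print(names)
```

Keeping the pairwise decision behind a plain callable mirrors the contrast drawn in the abstract: the same combination loop works whether the decision comes from hand-written rules or from a learned classifier.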

Bibliographic record

  • Source
    Pattern Recognition Letters | 2018, Issue 1 | pp. 105-113 | 9 pages
  • Author affiliations

    Wuhan Univ, Sch Comp, Wuhan, Hubei, Peoples R China;

    Heilongjiang Univ, Sch Comp Sci & Technol, Harbin, Heilongjiang, Peoples R China;

    Harvard Med Sch, Massachusetts Eye & Ear Infirm, Boston, MA USA;

    Hubei Univ Art & Sci, Dept Chinese Language & Literature, Xiangyang, Peoples R China;

    Heilongjiang Univ, Sch Comp Sci & Technol, Harbin, Heilongjiang, Peoples R China;

    Wuhan Univ, Sch Comp, Wuhan, Hubei, Peoples R China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA;
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification
  • Keywords

    Biomedical entity recognition; Irregular entity; LSTM; CRF;
