Conference: China National Conference on Computational Linguistics; International Symposium on Natural Language Processing Based on Naturally Annotated Big Data

Recognizing Biomedical Named Entities Based on the Sentence Vector/Twin Word Embeddings Conditioned Bidirectional LSTM


Abstract

As a fundamental step in biomedical information extraction, biomedical named entity recognition remains challenging. In recent years, neural networks have been applied to entity recognition to avoid complex hand-designed features derived from various linguistic analyses. However, conventional neural network systems are limited in their ability to exploit long-range dependencies in sentences. In this paper, we adopt a bidirectional recurrent neural network with LSTM units to identify biomedical entities, and add twin word embeddings and a sentence vector to enrich the input information, so that complex feature extraction can be skipped. In the testing phase, the Viterbi algorithm is used to filter out illogical label sequences. Experimental results on the BioCreative Ⅱ GM corpus show that our system achieves an F-score of 88.61%, which outperforms CRF models that use complex hand-designed features and is 6.74% higher than RNNs.
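
The abstract does not spell out how Viterbi decoding filters illogical label sequences, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a BIO tag set, per-token log-scores such as a bidirectional LSTM might emit, and a single hard constraint (an I tag may not follow O or start a sentence). The names `LABELS`, `allowed`, `viterbi`, and the example scores are all hypothetical.

```python
import math

LABELS = ["B", "I", "O"]  # assumed BIO tag set; the abstract does not name one


def allowed(prev, cur):
    """Assumed constraint: an I tag may only follow B or I."""
    return not (cur == "I" and prev == "O")


def viterbi(scores):
    """Return the best-scoring label sequence with no illegal transition.

    scores: one dict per token, mapping each label to a log-score.
    """
    # A sentence may not start with I, so that start state gets -inf.
    best = {y: (scores[0][y] if y != "I" else -math.inf) for y in LABELS}
    backpointers = []
    for token_scores in scores[1:]:
        new_best, ptr = {}, {}
        for cur in LABELS:
            # Best legal predecessor for the current label.
            prev_score, prev = max(
                (best[p], p) for p in LABELS if allowed(p, cur)
            )
            new_best[cur] = prev_score + token_scores[cur]
            ptr[cur] = prev
        best = new_best
        backpointers.append(ptr)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda y: best[y])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]


# Made-up scores for a three-token sentence.
scores = [
    {"B": -2.0, "I": -3.0, "O": -0.1},
    {"B": -4.0, "I": -0.2, "O": -3.0},
    {"B": -3.0, "I": -2.0, "O": -0.1},
]
greedy = [max(LABELS, key=lambda y: s[y]) for s in scores]
print(greedy)           # ['O', 'I', 'O'], which contains the illegal step O -> I
print(viterbi(scores))  # ['B', 'I', 'O']
```

Picking the best label for each token independently yields a sequence in which I follows O, whereas the constrained search returns the highest-scoring sequence that respects the transition rule. A real system could instead learn soft transition scores; the hard mask here is chosen only to keep the sketch short.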
