
CAN-NER: Convolutional Attention Network for Chinese Named Entity Recognition

Abstract

Named entity recognition (NER) is a common task in Natural Language Processing (NLP), but it remains more challenging in Chinese because the language lacks natural delimiters. Therefore, Chinese Word Segmentation (CWS) is usually required as the first step for Chinese NER. However, models based on word-level embeddings and lexicon features often suffer from segmentation errors and out-of-vocabulary (OOV) problems. In this paper, we investigate a Convolutional Attention Network (CAN) for Chinese NER, which consists of a character-based convolutional neural network (CNN) with a local-attention layer and a gated recurrent unit (GRU) with a global self-attention layer, capturing information from adjacent characters and from sentence contexts. Moreover, unlike other approaches, CAN-NER does not depend on any external resources such as lexicons, and its use of small-size character embeddings makes it more practical for real-world systems. Extensive experimental results show that our approach outperforms state-of-the-art methods, without word embeddings or external lexicon resources, on datasets from different domains.
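The two attention mechanisms the abstract describes can be illustrated with a toy sketch: a local attention that, for each character, weights only a small window of neighboring characters (a simplified stand-in for the paper's CNN-with-local-attention layer), followed by a sentence-level self-attention over all positions (a stand-in for the GRU-with-global-self-attention layer). This is a minimal illustration of the idea, not the paper's actual implementation; the window size, dimensions, and dot-product scoring used here are assumptions for the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def local_attention(char_embs, window=5):
    # For each character, attend over a local window of adjacent
    # characters (toy version of CAN-NER's local-attention layer).
    n, d = char_embs.shape
    half = window // 2
    out = np.zeros_like(char_embs)
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        ctx = char_embs[lo:hi]              # (w, d) local context
        scores = ctx @ char_embs[i]         # dot-product score vs. center char
        out[i] = softmax(scores) @ ctx      # weighted sum of neighbors
    return out

def global_self_attention(h):
    # Sentence-level self-attention over all positions
    # (toy version of the global self-attention layer).
    scores = h @ h.T / np.sqrt(h.shape[1])
    return softmax(scores, axis=-1) @ h

rng = np.random.default_rng(0)
chars = rng.normal(size=(8, 16))   # 8 characters, 16-dim embeddings
local = local_attention(chars)     # local context features per character
ctx = global_self_attention(local) # sentence-level context features
print(ctx.shape)
```

In the full model these features would feed a tagging layer (e.g. a CRF) to emit entity labels; the sketch only shows how local and global context are mixed at the character level, which is what lets the model avoid word segmentation and external lexicons.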

