Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Learning Context Using Segment-Level LSTM for Neural Sequence Labeling


Abstract

This article introduces an approach that learns segment-level context for sequence labeling in natural language processing (NLP). Previous approaches limit their basic unit for feature extraction to the word, because sequence labeling is a token-level task in which labels are annotated word by word. However, the text segment is the ultimate unit of labeling, and segment information can easily be obtained from annotated labels in the IOB/IOBES format. Most neural sequence labeling models expand their learning capacity by employing additional layers, such as a character-level layer, or by jointly training NLP tasks with common knowledge. The architecture of our model is based on the charLSTM-BiLSTM-CRF model, which we extend with an additional segment-level layer called segLSTM. We therefore propose a sequence labeling algorithm called charLSTM-BiLSTM-CRF-segLSTM$^{sLM}$, which employs an additional segment-level long short-term memory (LSTM) that trains features by learning adjacent context within a segment. We demonstrate the performance of our model on four sequence labeling datasets, namely, Penn Treebank, CoNLL 2000, CoNLL 2003, and OntoNotes 5.0. Experimental results show that our model performs better than state-of-the-art variants of BiLSTM-CRF. In particular, the proposed model improves performance on tasks that require finding appropriate labels for multi-token segments.
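The abstract notes that segment information can be recovered directly from token-level annotations in the IOB/IOBES format. A minimal sketch of that extraction step, assuming IOB tags of the form `B-TYPE`/`I-TYPE`/`O` (the helper `iob_segments` is illustrative, not from the paper):

```python
def iob_segments(labels):
    """Extract (start, end, type) segments from a sequence of IOB tags.

    `start` is inclusive, `end` is exclusive, and `type` is the label
    suffix after "B-"/"I-" (e.g. "PER" in "B-PER").
    """
    segments = []
    start, seg_type = None, None
    for i, label in enumerate(labels):
        # A "B-" tag, or an "I-" tag whose type differs from the open
        # segment, starts a new segment; close any open segment first.
        if label.startswith("B-") or (label.startswith("I-") and seg_type != label[2:]):
            if seg_type is not None:
                segments.append((start, i, seg_type))
            start, seg_type = i, label[2:]
        elif label == "O":
            if seg_type is not None:
                segments.append((start, i, seg_type))
            start, seg_type = None, None
    if seg_type is not None:  # close a segment that runs to the end
        segments.append((start, len(labels), seg_type))
    return segments
```

For example, the tag sequence `["B-PER", "I-PER", "O", "B-LOC", "O"]` yields the two segments `(0, 2, "PER")` and `(3, 4, "LOC")`, which is the segment-level view the proposed segLSTM layer operates over.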
