An Empirical Study of Automatic Chinese Word Segmentation for Spoken Language Understanding and Named Entity Recognition


Abstract

Word segmentation is usually regarded as the first step for many Chinese natural language processing tasks, yet its impact on these subsequent tasks is relatively under-studied. For example, how can the mismatch problem be solved when an existing word segmenter is applied to new data? Does a better word segmenter yield better performance on the subsequent NLP task? In this work, we make an initial attempt to answer these questions on two related subsequent tasks: semantic slot filling in spoken language understanding and named entity recognition. We propose three techniques to solve the mismatch problem: using word segmentation outputs as additional features, adaptation with partial learning, and exploiting the n-best word segmentation list. Experimental results demonstrate the effectiveness of these techniques for both tasks: we achieve an error reduction of about 11% for spoken language understanding and 24% for named entity recognition over the baseline systems.
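As an illustration of the first technique (using word segmentation outputs as additional features), the sketch below turns a segmenter's output into per-character boundary tags and attaches them to character-level feature dictionaries that a sequence labeler could consume. The abstract does not specify the feature encoding, so the B/M/E/S tag scheme, the function names `boundary_tags` and `char_features`, and the feature keys are all assumptions made for this example rather than the authors' implementation.

```python
def boundary_tags(words):
    """Map a segmented sentence to one B/M/E/S tag per character.

    B = begin of a multi-character word, M = middle, E = end,
    S = single-character word. (Assumed encoding, not from the paper.)
    """
    tags = []
    for word in words:
        if not word:
            continue  # ignore empty tokens from the segmenter
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def char_features(words):
    """Build character-level features, adding the segmenter tag as an extra feature."""
    chars = [ch for word in words for ch in word]
    tags = boundary_tags(words)
    features = []
    for i, (ch, tag) in enumerate(zip(chars, tags)):
        features.append({
            "char": ch,
            "prev_char": chars[i - 1] if i > 0 else "<s>",
            "next_char": chars[i + 1] if i + 1 < len(chars) else "</s>",
            "seg_tag": tag,  # hypothetical feature carrying the segmentation output
        })
    return features


# Hypothetical segmenter output for the sentence 我想去北京大学
segmented = ["我", "想", "去", "北京", "大学"]
for feat in char_features(segmented):
    print(feat)
```

Because the labeler still operates on characters, a segmentation error only corrupts one feature instead of fixing the wrong token boundaries for the downstream task, which is one plausible reason to prefer this setup when the segmenter was trained on mismatched data.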


Bibliographic details

Source: Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | 2016 | lviii 777 p. | 11 pages in total
Authors: Wencan Luo; Fan Yang
Original format: PDF
Chinese Library Classification: Programming, software engineering

Similar literature

1. Chinese word segmentation and named entity recognition: A pragmatic approach [J]. Gao JF, Li M, Wu A. Computational Linguistics. 2005, No. 4
2. Universal attribute characterization of spoken languages for automatic spoken language recognition [J]. Sabato Marco Siniscalchi, Jeremy Reed, Torbjorn Svendsen. Computer Speech and Language. 2013, No. 1
3. An Empirical Study of Automatic Chinese Word Segmentation for Spoken Language Understanding and Named Entity Recognition [C]. Wencan Luo, Fan Yang. Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016
4. An Application of Natural Language Processing: Named Entity Recognition with BLSTM in Chinese Corpora [D]. Mao, Lihui. 2019
5. Joint segmentation and named entity recognition using dual decomposition in Chinese discharge summaries [O]. Yan Xu, Yining Wang, Tianren Liu. 2014
6. An Empirical Study of Automatic Chinese Word Segmentation for Spoken Language Understanding and Named Entity Recognition [O]. Wencan Luo, Fan Yang. 2016
