Published in: Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

An Empirical Study of Automatic Chinese Word Segmentation for Spoken Language Understanding and Named Entity Recognition

Abstract

Word segmentation is usually recognized as the first step for many Chinese natural language processing tasks, yet its impact on these subsequent tasks is relatively under-studied. For example, how can the mismatch problem be solved when an existing word segmenter is applied to new data? Does a better word segmenter yield better performance on the subsequent NLP task? In this work, we make an initial attempt to answer these questions on two related subsequent tasks: semantic slot filling in spoken language understanding and named entity recognition. We propose three techniques to solve the mismatch problem: using word segmentation outputs as additional features, adaptation with partial learning, and taking advantage of n-best word segmentation lists. Experimental results demonstrate the effectiveness of these techniques for both tasks, and we achieve an error reduction of about 11% for spoken language understanding and 24% for named entity recognition over the baseline systems.
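To make the first technique concrete, the sketch below shows one common way a segmenter's output can be passed to a character-level tagger as an additional feature rather than as a hard input. It is only an illustration: the B/M/E/S position encoding, the feature names, and the functions `segmentation_tags` and `char_features` are assumptions made here, and the abstract does not specify the paper's actual feature set.

```python
from typing import Dict, List


def segmentation_tags(words: List[str]) -> List[str]:
    """Map a segmented sentence to one B/M/E/S position tag per character."""
    tags: List[str] = []
    for word in words:
        if not word:
            continue                  # ignore empty tokens
        if len(word) == 1:
            tags.append("S")          # single-character word
        else:
            tags.append("B")          # first character of a word
            tags.extend("M" * (len(word) - 2))  # middle characters, if any
            tags.append("E")          # last character of a word
    return tags


def char_features(words: List[str]) -> List[Dict[str, str]]:
    """Build per-character feature dicts for a character-level tagger."""
    chars = [ch for word in words for ch in word]
    tags = segmentation_tags(words)
    features = []
    for i, (ch, tag) in enumerate(zip(chars, tags)):
        features.append({
            "char": ch,
            "prev_char": chars[i - 1] if i > 0 else "<s>",
            "next_char": chars[i + 1] if i + 1 < len(chars) else "</s>",
            # Assumed encoding: the segmenter's decision enters as a soft
            # feature, so the tagger can learn when to distrust it.
            "seg_tag": tag,
        })
    return features


# Hypothetical segmenter output for the sentence "我想去北京"
words = ["我", "想", "去", "北京"]
print(segmentation_tags(words))
```

Feature dicts of this shape could be fed to any sequence labeler that accepts per-token features. Keeping the tagger at the character level means a segmentation error on new data degrades a single feature and does not corrupt the token boundaries themselves.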
