Conference: AAAI Conference on Artificial Intelligence

A Unified Model for Cross-Domain and Semi-Supervised Named Entity Recognition in Chinese Social Media

Abstract

Named entity recognition (NER) in Chinese social media is important but difficult because the text is informal and noisy. Previous methods focus only on in-domain supervised learning, which is limited by the scarcity of annotated data. However, abundant corpora exist in formal domains, along with massive amounts of in-domain unannotated text, and both can be used to improve the task. We propose a unified model that learns from out-of-domain corpora and in-domain unannotated text. The unified model contains two major functions: one for cross-domain learning and the other for semi-supervised learning. The cross-domain learning function learns out-of-domain information based on domain similarity. The semi-supervised learning function learns from in-domain unannotated text by self-training. Both learning functions outperform existing methods for NER in Chinese social media. Finally, our unified model yields nearly 11% absolute improvement over previously published results.
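
The abstract names two mechanisms without giving detail: weighting out-of-domain data by domain similarity, and self-training on unannotated in-domain text. The following is a minimal sketch of how such a scheme could look, not the authors' implementation. The character-count cosine similarity, the confidence threshold of 0.9, and all function names (`cosine_similarity`, `cross_domain_weights`, `select_pseudo_labeled`) are illustrative assumptions that do not come from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cross_domain_weights(out_of_domain, in_domain):
    """Assumed scheme: weight each out-of-domain sentence by how closely
    its character counts match the in-domain character profile."""
    profile = Counter()
    for sentence in in_domain:
        profile.update(sentence)
    return [cosine_similarity(Counter(s), profile) for s in out_of_domain]


def select_pseudo_labeled(predictions, threshold=0.9):
    """Assumed self-training step: keep only automatically tagged
    sentences whose model confidence reaches the threshold.

    `predictions` is a list of (sentence, tags, confidence) tuples."""
    return [p for p in predictions if p[2] >= threshold]


if __name__ == "__main__":
    in_domain = ["今天好开心", "好开心啊"]      # unannotated social media text
    out_of_domain = ["今天开会", "xyz"]         # formal-domain sentences
    print(cross_domain_weights(out_of_domain, in_domain))

    predictions = [
        ("今天好开心", ["O", "O", "O", "O", "O"], 0.95),
        ("好开心啊", ["O", "O", "O", "O"], 0.40),
    ]
    print(select_pseudo_labeled(predictions))
```

In a training loop, the similarity weights could scale the loss or learning rate of each out-of-domain sentence, while the selected pseudo-labeled sentences would be added to the training set for the next round. Both uses are assumptions about how the two functions might be combined.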
