Venue: IEEE International Conference on Acoustics, Speech and Signal Processing

Improving NER in Social Media via Entity Type-Compatible Unknown Word Substitution



Abstract

Named entity recognition (NER) is a fundamental task in information extraction (IE), and current state-of-the-art methods achieve high performance on clean text (e.g., newswire genres). However, most of these algorithms do not generalize well when transferred to noisy domains such as social media. To mitigate the noisy expressions in social media data, we present a novel word substitution strategy based on constructing an entity type-compatible (ETC) semantic space. We substitute unknown words with ETC words found by deep metric learning (DML) and nearest neighbor (NN) search. Comprehensive experiments show that the proposed framework achieves state-of-the-art performance on the W-NUT2017 dataset and that the substitution strategy generalizes well across multiple NER tools and previous works.
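The substitution step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an ETC space has already been learned (the deep metric learning stage is not shown), and the toy 2-D vectors, the vocabulary, and the names `EMBEDDINGS`, `TOY_UNKNOWN_VECTORS`, `nearest_known_word`, and `substitute_unknown` are all hypothetical. How an unknown word is mapped into the space is not stated in the abstract, so a hand-written lookup stands in for that step.

```python
import math

# Toy 2-D vectors standing in for a learned entity type-compatible (ETC) space.
# Assumption: words of the same entity type lie close together. All values are made up.
EMBEDDINGS = {
    "london": (0.9, 0.1),   # location-like region
    "paris": (0.8, 0.2),
    "john": (0.1, 0.9),     # person-like region
    "mary": (0.2, 0.8),
}

# Hypothetical projections of unknown (noisy) tokens into the same space.
TOY_UNKNOWN_VECTORS = {
    "lndn": (0.9, 0.12),
    "jhn": (0.12, 0.9),
}

# Known vocabulary: ordinary words plus every word that has an embedding.
VOCAB = {"i", "met", "in"} | set(EMBEDDINGS)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_known_word(query_vec, embeddings):
    """Brute-force nearest neighbor search by cosine similarity."""
    return max(embeddings, key=lambda w: cosine(query_vec, embeddings[w]))


def substitute_unknown(tokens, vocab, embed_unknown, embeddings):
    """Replace each out-of-vocabulary token with its nearest known neighbor."""
    result = []
    for tok in tokens:
        if tok.lower() in vocab:
            result.append(tok)
        else:
            result.append(nearest_known_word(embed_unknown(tok), embeddings))
    return result


tokens = ["i", "met", "jhn", "in", "lndn"]
print(substitute_unknown(tokens, VOCAB, TOY_UNKNOWN_VECTORS.__getitem__, EMBEDDINGS))
# ['i', 'met', 'john', 'in', 'london']
```

The rewritten sentence contains only in-vocabulary words, so under these assumptions it could be passed unchanged to an existing NER tool, which is one plausible reading of how the strategy applies across multiple tools.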
