Published in: Multimedia Tools and Applications

Semi-automatic construction of a named entity dictionary for entity-based sentiment analysis in social media

Abstract

To understand user experience in social media, or to facilitate the design of human-centric services through social media, users' opinions about specific entities in text messages should be captured. A fine-grained named entity recognizer (NER) is an essential module for identifying opinion targets in text messages, and a named-entity (NE) dictionary is a major resource that affects the performance of an NER. However, an NE dictionary is not easy to construct manually, because human annotation is time-consuming and labor-intensive. To reduce construction time and labor, we propose a semi-automatic system that constructs an NE dictionary from Wikipedia, a free online resource. The proposed system constructs a pseudo-document for each Wikipedia NE by using an active-learning technique. It then classifies Wikipedia entries into NE classes based on similarities between the entries and the pseudo-documents in a vector space. In experiments, the proposed system classified 92.3% of Wikipedia entries into 29 NE classes. It showed high performance, with a macro-averaged F1-measure of 0.872 and a micro-averaged F1-measure of 0.935.
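The classification step described in the abstract, assigning an entry to the NE class whose pseudo-document is closest in a vector space, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one pseudo-document per NE class, plain term-frequency vectors, cosine similarity, and a rejection threshold for entries that match nothing well. The abstract does not confirm any of these choices. The class names, the pseudo-document contents, the `threshold` value, and the function names are all hypothetical, and the active-learning step that builds the pseudo-documents is not modeled.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn text into a sparse term-frequency vector (a Counter)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(entry_text, pseudo_docs, threshold=0.1):
    """Return the most similar NE class, or None if no class reaches the threshold.

    The threshold is an assumption of this sketch: it leaves poorly
    matching entries unclassified instead of forcing a label.
    """
    entry = vectorize(entry_text)
    best_class, best_score = None, 0.0
    for ne_class, doc in pseudo_docs.items():
        score = cosine(entry, vectorize(doc))
        if score > best_score:
            best_class, best_score = ne_class, score
    return best_class if best_score >= threshold else None


# Hypothetical pseudo-documents: a bag of indicative terms per NE class.
PSEUDO_DOCS = {
    "PERSON": "born actor singer politician author career",
    "LOCATION": "city country river population located region",
    "ORGANIZATION": "company founded headquarters corporation university",
}

label = classify("a city located on the river with a large population", PSEUDO_DOCS)
unmatched = classify("xyzzy", PSEUDO_DOCS)
```

Here `label` is `"LOCATION"`, because only that pseudo-document shares terms with the entry, and `unmatched` is `None`, because the entry shares no terms with any class. A real system would more likely use weighted vectors (for example TF-IDF) over full Wikipedia entry text.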
