Conference: Annual Meeting of the Association for Computational Linguistics

Building Sentiment Lexicons for All Major Languages


Abstract

Sentiment analysis in a multilingual world remains a challenging problem, because developing language-specific sentiment lexicons is an extremely resource-intensive process. Such lexicons remain a scarce resource for most languages. In this paper, we address this lexicon gap by building high-quality sentiment lexicons for 136 major languages. We integrate a variety of linguistic resources to produce an immense knowledge graph. By appropriately propagating from seed words, we construct sentiment lexicons for each component language of our graph. Our lexicons have a polarity agreement of 95.7% with published lexicons, while achieving an overall coverage of 45.2%. We demonstrate the performance of our lexicons in an extrinsic analysis of the Wikipedia articles of 2,000 distinct historical figures in 30 languages. Despite cultural differences and the intended neutrality of Wikipedia articles, our lexicons show an average sentiment correlation of 0.28 across all language pairs.
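
The abstract does not spell out the propagation rule or how agreement and coverage are computed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes an undirected word graph whose edges stand for links such as translations or synonyms, a simple majority vote among already-labeled neighbours, and coverage defined as the share of reference-lexicon words that receive a label. All names (`propagate_polarity`, `agreement_and_coverage`) and the toy data are hypothetical.

```python
from collections import defaultdict


def propagate_polarity(edges, seeds):
    """Spread seed polarities (+1 / -1) over an undirected word graph.

    Assumed rule (not taken from the paper): in each round, every
    unlabeled word takes the sign of the summed polarity of its labeled
    neighbours; a tie (sum of zero) leaves the word unlabeled.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    polarity = dict(seeds)
    while True:
        updates = {}
        for word, adjacent in neighbours.items():
            if word in polarity:
                continue
            vote = sum(polarity.get(n, 0) for n in adjacent)
            if vote != 0:
                updates[word] = 1 if vote > 0 else -1
        if not updates:  # nothing new was labeled, so stop
            return polarity
        polarity.update(updates)


def agreement_and_coverage(predicted, reference):
    """Compare a predicted lexicon with a reference lexicon.

    coverage  = share of reference words that have a predicted label
    agreement = share of those shared words whose polarity matches
    """
    shared = [w for w in reference if w in predicted]
    coverage = len(shared) / len(reference) if reference else 0.0
    matches = sum(predicted[w] == reference[w] for w in shared)
    agreement = matches / len(shared) if shared else 0.0
    return agreement, coverage


# Toy graph: English seeds linked to Spanish and German words.
edges = [
    ("good", "bueno"), ("bueno", "buen"), ("good", "gut"),
    ("bad", "malo"), ("malo", "mal"), ("bad", "schlecht"),
]
seeds = {"good": 1, "bad": -1}
lexicon = propagate_polarity(edges, seeds)

# Hypothetical reference lexicon for the toy example.
reference = {"bueno": 1, "malo": -1, "mal": 1, "triste": -1}
agreement, coverage = agreement_and_coverage(lexicon, reference)
print(lexicon)
print(f"agreement={agreement:.3f} coverage={coverage:.3f}")
```

In this toy run, three of the four reference words receive a label and two of those three match, which illustrates how a lexicon can score high on agreement while covering only part of a reference list.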
