Conference: Annual Meeting of the Association for Computational Linguistics

The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis

Abstract

Expensive feature engineering based on WordNet senses has been shown to be useful for document-level sentiment classification. A plausible reason for this performance improvement is the reduction in data sparsity. However, such a reduction could be achieved with less effort by means of syntagma-based word clustering. In this paper, the problem of data sparsity in sentiment analysis, both monolingual and cross-lingual, is addressed through clustering. Experiments show that cluster-based data sparsity reduction outperforms sense-based classification for document-level sentiment analysis. A similar idea is applied to Cross-Lingual Sentiment Analysis (CLSA), and it is shown that reducing data sparsity (after translation or bilingual mapping) yields higher accuracy than both Machine Translation based CLSA and sense-based CLSA.
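
To make the idea of cluster-based sparsity reduction concrete, here is a minimal Python sketch. It is not the authors' implementation: the mapping `WORD_TO_CLUSTER`, the cluster labels, the helper `cluster_features`, and the toy documents are all hypothetical, and in practice the mapping would presumably be induced by a clustering algorithm run over a large unlabelled corpus. The sketch only shows how replacing words with cluster identifiers shrinks the feature space and lets different surface words share a feature.

```python
from collections import Counter

# Hypothetical word-to-cluster mapping (assumption for illustration only);
# a real mapping would be learned from an unlabelled corpus.
WORD_TO_CLUSTER = {
    "good": "C1", "great": "C1", "excellent": "C1",
    "bad": "C2", "awful": "C2", "terrible": "C2",
    "movie": "C3", "film": "C3",
}


def cluster_features(tokens, word_to_cluster):
    """Count cluster IDs in a document; unknown words keep their surface form."""
    return Counter(word_to_cluster.get(t.lower(), t.lower()) for t in tokens)


# Toy documents (invented, not from the paper's data).
docs = [
    "A great movie".split(),
    "An excellent film".split(),
    "A terrible movie".split(),
]

# Feature space when every distinct word is its own feature.
word_vocab = {t.lower() for d in docs for t in d}

# Feature space after mapping words to clusters.
cluster_vocab = {f for d in docs for f in cluster_features(d, WORD_TO_CLUSTER)}

print(len(word_vocab), len(cluster_vocab))
```

In this toy setting the first two documents share no content words, yet both receive the features `C1` and `C3` after mapping, so a classifier trained on one can generalise to the other. Under these assumptions, that sharing is the kind of sparsity reduction the abstract refers to.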