Journal: Knowledge-Based Systems

Imbalanced text sentiment classification using universal and domain-specific knowledge



Abstract

In this paper, a sentiment classification model is proposed to address two predominant issues in sentiment classification, namely domain sensitivity and data imbalance. Since words may carry distinct sentiment polarities in different contexts, sentiment classification is widely regarded as a domain-sensitive task. Accordingly, this paper draws on label propagation to induce universal and domain-specific sentiment lexicons and builds a domain-adaptive sentiment classification model that incorporates universal and domain-specific knowledge into a unified learning framework. On the other hand, sentiment-related corpora usually have a skewed polarity distribution, because individuals tend to share similar assessment criteria for a given object and hence their sentiment polarities toward the same object are likely to be similar. We address this imbalanced-data problem by proposing a novel over-sampling technique. Unlike existing over-sampling approaches that generate minority-class samples in a numerical feature space, the proposed sampling method directly creates synthetic texts in the word space. Several experiments are conducted to verify the effectiveness of the proposed lexicon generation method, learning framework, and over-sampling method. Results show that the induced sentiment lexicons are interpretable and that the proposed model is effective for imbalanced and domain-specific text sentiment classification.
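
The abstract does not spell out how label propagation induces a lexicon, so the following is only a generic sketch of the idea, not the paper's algorithm. It assumes a hypothetical word graph whose edge weights stand for word similarity and a handful of seed words with known polarity; the function name `propagate_polarity`, the toy graph, and the weights are all illustrative assumptions. Each non-seed word repeatedly takes the weighted average of its neighbours' scores while seed words stay clamped.

```python
def propagate_polarity(graph, seeds, iterations=50):
    """Spread seed polarities over a word graph (illustrative sketch only).

    graph: dict mapping word -> {neighbour: similarity weight} (hypothetical input)
    seeds: dict mapping seed word -> polarity in [-1.0, 1.0]
    Returns a dict mapping every word in the graph to a polarity score.
    """
    # Seeds start at their known polarity; all other words start neutral.
    scores = {word: seeds.get(word, 0.0) for word in graph}
    for _ in range(iterations):
        updated = {}
        for word, neighbours in graph.items():
            if word in seeds:
                # Clamp seed words so their labels are never overwritten.
                updated[word] = seeds[word]
                continue
            total = sum(neighbours.values())
            if total == 0:
                # Isolated word: keep its current score.
                updated[word] = scores[word]
                continue
            # Weighted average of the neighbours' current scores.
            updated[word] = sum(
                weight * scores[other] for other, weight in neighbours.items()
            ) / total
        scores = updated
    return scores


# Toy graph with made-up similarity weights (assumption, not data from the paper).
graph = {
    "good": {"reliable": 0.8, "fast": 0.6},
    "bad": {"noisy": 0.9, "fast": 0.1},
    "reliable": {"good": 0.8},
    "noisy": {"bad": 0.9},
    "fast": {"good": 0.6, "bad": 0.1},
}
seeds = {"good": 1.0, "bad": -1.0}
lexicon = propagate_polarity(graph, seeds)
```

Under these assumptions, building the graph from a general corpus would yield something like a universal lexicon, while building it from an in-domain corpus would yield domain-specific scores for the same words; how the paper actually constructs its graphs is not stated in the abstract.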
