Knowledge-Based Systems

Weakly supervised topic sentiment joint model with word embeddings


Abstract

A topic sentiment joint model aims to detect the topics and sentiments that are mixed together in online reviews simultaneously. Most existing topic sentiment modeling algorithms are based on the state-of-the-art latent Dirichlet allocation (LDA) and probabilistic latent semantic analysis (PLSA) models, which infer sentiment and topic distributions from the co-occurrence of words. These methods have been used successfully for topic and sentiment analysis. However, when the training corpus is small or the documents are short, the textual features become sparse, and the resulting sentiment and topic distributions may be unsatisfactory. In this paper, we propose a novel topic sentiment joint model, the weakly supervised topic sentiment joint model with word embeddings (WS-TSWE), which incorporates word embeddings and the HowNet lexicon to improve topic identification and sentiment recognition. WS-TSWE makes two main contributions. (1) Existing models generate words only from the sentiment-topic-to-word Dirichlet multinomial component, whereas WS-TSWE replaces it with a mixture of two components: a Dirichlet multinomial component and a word embeddings component. Because the word embeddings are trained on very large corpora and extend the semantic information of the words, they help mitigate the text sparsity problem. (2) Most previous models incorporate sentiment knowledge in the beta priors, which are usually set from a dictionary and rely entirely on prior domain knowledge to identify positive and negative words. In contrast, WS-TSWE calculates the sentiment orientation of each word with the HowNet lexicon and automatically infers sentiment-based beta priors for sentiment analysis and opinion mining. We implement WS-TSWE with Gibbs sampling. Experimental results on Chinese and English data sets show that WS-TSWE performs well on the task of detecting sentiment and topics simultaneously. (c) 2018 Elsevier B.V. All rights reserved.
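
The two ideas in the abstract can be illustrated with a small sketch. The abstract does not give the model's equations, so everything below is an assumption made for illustration: the softmax form of the embedding component, the mixing weight `lam`, the per-pair vector `topic_vector`, and the linear mapping from an orientation score in [-1, 1] to beta values (with hypothetical parameters `base_beta` and `floor`) are not taken from the paper. The sketch shows only the general shape of a two-component word distribution and of lexicon-derived asymmetric priors.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array."""
    shifted = scores - np.max(scores)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()


def dirichlet_multinomial_probs(counts, beta):
    """Smoothed word probabilities for one (sentiment, topic) pair.

    counts[w] is how often word w is assigned to the pair,
    beta[w] is the prior pseudo-count for word w.
    """
    counts = np.asarray(counts, dtype=float)
    beta = np.asarray(beta, dtype=float)
    return (counts + beta) / (counts.sum() + beta.sum())


def embedding_probs(topic_vector, word_vectors):
    """Word probabilities from embeddings (assumed softmax form)."""
    return softmax(np.asarray(word_vectors) @ np.asarray(topic_vector))


def mixture_word_probs(counts, beta, topic_vector, word_vectors, lam):
    """Mix the two components; lam is a hypothetical mixing weight."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    dm = dirichlet_multinomial_probs(counts, beta)
    emb = embedding_probs(topic_vector, word_vectors)
    return (1.0 - lam) * dm + lam * emb


def sentiment_beta_priors(orientation, base_beta=0.01, floor=0.1):
    """Turn per-word orientation scores in [-1, 1] into beta priors.

    Returns an array of shape (2, V): row 0 for the positive label,
    row 1 for the negative label. This linear scheme is illustrative,
    not the paper's; the floor keeps every prior strictly positive.
    """
    o = np.asarray(orientation, dtype=float)
    positive = np.clip((1.0 + o) / 2.0, floor, 1.0)
    negative = np.clip((1.0 - o) / 2.0, floor, 1.0)
    return base_beta * np.vstack([positive, negative])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    word_vectors = rng.normal(size=(4, 3))   # 4 words, 3-dim embeddings
    topic_vector = rng.normal(size=3)
    priors = sentiment_beta_priors([0.9, -0.8, 0.0, 0.2])
    probs = mixture_word_probs(
        counts=[5, 0, 2, 1],
        beta=priors[0],                      # priors for the positive label
        topic_vector=topic_vector,
        word_vectors=word_vectors,
        lam=0.6,
    )
    print(probs, probs.sum())
```

With sparse counts the Dirichlet multinomial term alone puts almost no mass on unseen words, while the embedding term can still rank them by similarity to the topic vector, which is the intuition the abstract gives for using embeddings against text sparsity.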
