
Chinese Sentiment Orientation Analysis

Abstract

In this paper, we present a new method for classifying the sentiment orientation of merchandise comments into three categories: neutral, positive, and negative. Many existing methods separate positive from negative sentences well but perform poorly on neutral sentences, so we apply a divide-and-conquer strategy: texts are first split into neutral and polar texts, and the polar texts are then divided into positive and negative. In the first step, a TSVM tool performs the neutral-versus-polar classification. Our training data is highly imbalanced, containing many polar sentences but very few neutral ones, so we divide the polar data into several small parts and combine each part with all of the neutral data to form a training set. This yields several TSVM classifiers, whose outputs are combined by a voting scheme to produce the final result. In the second step, we propose an algorithm for positive-versus-negative classification. First, a method is designed to re-evaluate each sentiment word and divide the dictionary into two parts based on confidence, which reduces the negative impact of low-confidence words. Then an orientation analysis and classification algorithm classifies the polar sentences step by step. In addition, a set of rules is built to classify sentences that contain sentiment words not found in our sentiment dictionary.
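
The first step described above (partition the polar data, pair each part with all of the neutral data, train one classifier per part, then vote) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the number of parts, or the voting rule, so the two-number feature vectors, the round-robin split, the simple majority vote, and all function names here are assumptions. A nearest-centroid classifier stands in for TSVM purely to keep the example self-contained.

```python
from collections import Counter
from typing import Callable, List, Sequence

Vector = Sequence[float]
Classifier = Callable[[Vector], str]


def split_into_parts(items: List[Vector], n_parts: int) -> List[List[Vector]]:
    """Distribute items round-robin into n_parts lists of similar size."""
    return [list(items[i::n_parts]) for i in range(n_parts)]


def train_centroid_classifier(polar: List[Vector], neutral: List[Vector]) -> Classifier:
    """Hypothetical stand-in for a TSVM: label by the nearer class centroid."""
    def centroid(rows: List[Vector]) -> List[float]:
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist2(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_polar, c_neutral = centroid(polar), centroid(neutral)

    def predict(x: Vector) -> str:
        return "polar" if dist2(x, c_polar) < dist2(x, c_neutral) else "neutral"

    return predict


def train_ensemble(polar: List[Vector], neutral: List[Vector], n_parts: int,
                   train=train_centroid_classifier) -> List[Classifier]:
    """Train one classifier per polar part; every part sees all neutral data."""
    parts = split_into_parts(polar, n_parts)
    return [train(part, neutral) for part in parts if part]


def vote(classifiers: List[Classifier], x: Vector) -> str:
    """Return the label predicted by the most classifiers (simple majority)."""
    counts = Counter(clf(x) for clf in classifiers)
    return counts.most_common(1)[0][0]


# Toy data: [sentiment-word count, exclamation-mark count] (assumed features).
polar_data = [[3, 1], [4, 0], [2, 2], [5, 1], [3, 0], [4, 2]]
neutral_data = [[0, 0], [1, 0]]

# An odd number of parts avoids tied votes between the two labels.
ensemble = train_ensemble(polar_data, neutral_data, n_parts=3)
print(vote(ensemble, [4, 1]))
print(vote(ensemble, [0, 0]))
```

Because each training set contains all of the neutral sentences but only a fraction of the polar ones, every individual classifier sees a more balanced class ratio than it would on the full data. Any trainer with the same `(polar, neutral) -> classifier` shape can be passed as `train`.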