Journal: 计算机科学 (Computer Science)

Sentiment Analysis of Chinese Micro-blogs Based on Emoticons and Sentiment Words

     

Abstract

Micro-blogging is a social media platform that emerged in the Web 2.0 era. Users vent their feelings through micro-blogs, expressing joy, anger, sorrow, likes, and dislikes, which produces a massive volume of emotional text. Analyzing this text reveals users' emotional state, their views on social phenomena, and their product preferences; such information has commercial value and also contributes to social stability. This paper builds a Chinese micro-blog sentiment corpus from the emoticons in micro-blogs combined with sentiment words, which ensures the scale and accuracy of the corpus while eliminating manual annotation effort. A Bayes classifier is then built on the corpus. Finally, the concept of entropy is used to optimize the corpus, which improves classification accuracy, and the performance of different n-gram features is compared. The best results are obtained with unigram features after entropy-based optimization: recall and accuracy both exceed 85%, and the F-measure exceeds 89%.
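The pipeline the abstract describes (emoticon-based labeling, a Bayes classifier over n-gram features, entropy-based optimization) can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the lexicons are made up, posts are assumed to be already word-segmented, a post is labeled only when its emoticons and sentiment words agree, and entropy is assumed to be applied by dropping n-grams whose counts are spread evenly across classes. The abstract does not specify any of these details, and all function names here are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical lexicons, for illustration only.
POS_EMOTICONS = {"[哈哈]", "[赞]"}
NEG_EMOTICONS = {"[怒]", "[泪]"}
POS_WORDS = {"开心", "喜欢"}
NEG_WORDS = {"讨厌", "难过"}
EMOTICONS = POS_EMOTICONS | NEG_EMOTICONS

def auto_label(tokens):
    """Assumed rule: label a post only if emoticon and sentiment word agree."""
    toks = set(tokens)
    pos = bool(toks & POS_EMOTICONS) and bool(toks & POS_WORDS)
    neg = bool(toks & NEG_EMOTICONS) and bool(toks & NEG_WORDS)
    if pos == neg:  # no evidence, or conflicting evidence
        return None
    return "pos" if pos else "neg"

def ngrams(tokens, n=1):
    """n-gram features; emoticons are stripped so labels do not leak in."""
    words = [t for t in tokens if t not in EMOTICONS]
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def entropy(counts):
    """Shannon entropy (bits) of a feature's distribution over classes."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

def train(posts, n=1, max_entropy=0.9):
    docs, counts = Counter(), defaultdict(Counter)
    for tokens in posts:
        label = auto_label(tokens)
        if label is None:
            continue
        docs[label] += 1
        counts[label].update(ngrams(tokens, n))
    labels = sorted(docs)
    # Assumed entropy filter: keep only class-discriminative n-grams.
    vocab = {f for f in set().union(*counts.values())
             if entropy([counts[l][f] for l in labels]) <= max_entropy}
    return {"docs": docs, "counts": counts, "vocab": vocab, "n": n}

def classify(model, tokens):
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""
    docs, counts, vocab = model["docs"], model["counts"], model["vocab"]
    feats = [f for f in ngrams(tokens, model["n"]) if f in vocab]
    best, best_score = None, -math.inf
    for label in sorted(docs):
        total = sum(counts[label][f] for f in vocab)
        score = math.log(docs[label] / sum(docs.values()))
        for f in feats:
            score += math.log((counts[label][f] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

posts = [
    ["今天", "很", "开心", "[哈哈]"],
    ["我", "喜欢", "这个", "手机", "[赞]"],
    ["今天", "很", "难过", "[泪]"],
    ["我", "讨厌", "这个", "手机", "[怒]"],
    ["今天", "下雨"],  # no emoticon or sentiment word: left unlabeled
]
model = train(posts, n=1)
print(classify(model, ["这个", "电影", "很", "开心"]))
```

In this toy corpus, words such as 今天 and 手机 occur equally often in both classes, so their entropy is 1 bit and the filter removes them; only the polarity-bearing unigrams remain as features. Passing `n=2` to `train` switches to bigram features, which is one way to reproduce the kind of n-gram comparison the abstract mentions.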
