IEEE International Conference on Computer and Communications

Research on Sentiment Analysis of Microblogging Based on LSA and TF-IDF

Abstract

As a typical social network application, microblogging now influences many aspects of people's lives, and it is attracting a growing number of researchers. Sentiment analysis of microblog text is currently an active research area. Feature selection and extraction is one of its core components, and TF-IDF is the most widely used feature selection method. Although TF-IDF is simple to use, it suffers from a loss of semantics: it ignores the semantic information contained in the text. To address this problem, this paper introduces latent semantic analysis (LSA). First, the feature vectors generated by the TF-IDF algorithm are factorized by singular value decomposition. Then, the cosine between the row vectors of the decomposition result is computed to measure the similarity between words, which accomplishes feature extraction and compensates for the deficiency of TF-IDF. Finally, the extracted features are fed to four classification algorithms to verify the effectiveness of the proposed method. The experimental results show that introducing LSA improves the accuracy, recall, and F-measure of microblog text classification.
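The pipeline described in the abstract (TF-IDF weighting, singular value decomposition, cosine similarity between word rows) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy corpus, the plain `tf * log(N / df)` weighting, the rank `k = 2`, the choice of `U_k * S_k` as word vectors, and all function names are hypothetical, and the classification step is omitted because the abstract does not name the four classifiers.

```python
import numpy as np

# Toy, pre-tokenized posts (hypothetical data, not from the paper).
docs = [
    ["happy", "great"],
    ["glad", "great"],
    ["sad", "awful"],
    ["angry", "awful"],
]

def tfidf_matrix(docs):
    """Return (vocab, A), where A[i, j] is the TF-IDF weight of term i in doc j."""
    vocab = sorted({w for d in docs for w in d})
    index = {w: i for i, w in enumerate(vocab)}
    n_docs = len(docs)
    # Document frequency: number of posts that contain each term.
    df = np.zeros(len(vocab))
    for d in docs:
        for w in set(d):
            df[index[w]] += 1
    # Term frequency: occurrences normalized by post length.
    A = np.zeros((len(vocab), n_docs))
    for j, d in enumerate(docs):
        for w in d:
            A[index[w], j] += 1.0 / len(d)
    # Assumed weighting: tf * log(N / df); the paper's exact variant is unknown.
    idf = np.log(n_docs / df)
    return vocab, A * idf[:, None]

def lsa_term_vectors(A, k):
    """Rank-k word vectors: rows of U_k scaled by the top-k singular values."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] * s[:k]

def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

vocab, A = tfidf_matrix(docs)
T = lsa_term_vectors(A, k=2)
row = {w: i for i, w in enumerate(vocab)}

# Compare word similarity before and after the LSA projection.
for a, b in [("happy", "glad"), ("happy", "sad")]:
    raw = cosine(A[row[a]], A[row[b]])
    lsa = cosine(T[row[a]], T[row[b]])
    print(f"{a}/{b}: raw TF-IDF cosine = {raw:.3f}, LSA cosine = {lsa:.3f}")
```

In this toy corpus "happy" and "glad" never appear in the same post, so their raw TF-IDF rows are orthogonal. After the rank-2 projection their vectors become nearly parallel because both words co-occur with "great", while "happy" and "sad" stay unrelated. That is the kind of semantic signal the abstract says TF-IDF alone misses.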
