Conference: International Conference on Sustainable Information Engineering and Technology

Sentiment Analysis on Movie Reviews Using Ensemble Features and Pearson Correlation Based Feature Selection

Abstract

Microblogging has become a very popular information medium among internet users, making it a rich source of opinions and reviews, especially movie reviews. We propose sentiment analysis on movie reviews using ensemble features and Bag of Words, combined with Pearson's correlation feature selection to improve classification performance, reduce the feature dimension, and obtain the optimal feature combinations. Classification uses several Naïve Bayes models: Bernoulli Naïve Bayes for binary data, Gaussian Naïve Bayes for continuous data, and Multinomial Naïve Bayes for numeric data. With non-standard words left in the tweets, the evaluation yields 82% accuracy, 86% precision, 79.62% recall, and 82.69% F-measure using 20% feature selection. After manual word standardization, accuracy increases by 8 percentage points to 90%, with 92% precision, 88.46% recall, and 90.19% F-measure using 85% feature selection. These results indicate that word standardization improves classification performance, and that Pearson's correlation feature selection provides optimal feature combinations while reducing the total number of feature dimensions.
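The abstract does not spell out how the Pearson-based selection is computed, so the following is only a minimal sketch under stated assumptions: each feature column is scored by the absolute Pearson correlation with the class label, and the "20%" and "85%" figures are read as the fraction of top-ranked features that is kept. The function names (`pearson`, `select_features`, `f_measure`) and the toy data are hypothetical and are not taken from the paper. The Naïve Bayes classification step is not covered here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no linear relation to the label
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, keep_fraction):
    """Indices of the top `keep_fraction` features, ranked by |r| with the label.

    Assumption: ranking against the class label; the paper may rank differently.
    """
    n_features = len(rows[0])
    scores = [
        abs(pearson([row[j] for row in rows], labels))
        for j in range(n_features)
    ]
    n_keep = max(1, round(keep_fraction * n_features))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 6 reviews, 5 binary features, 1 = positive sentiment.
labels = [1, 1, 1, 0, 0, 0]
rows = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 1],
]

selected = select_features(rows, labels, keep_fraction=0.4)
print("selected feature indices:", selected)

# Precision and recall reported for the non-standardized run.
print("F-measure:", round(f_measure(0.86, 0.7962), 4))
```

In the toy data, features 0 and 3 track the label perfectly (directly and inversely), so they are the two that survive a 40% cut. Applying `f_measure` to the reported 86% precision and 79.62% recall gives roughly 0.827, in line with the 82.69% F-measure stated in the abstract.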
