Conference: International Conference on Advanced Science and Engineering

Twitter Sentiment Analysis using an Ensemble Weighted Majority Vote Classifier

Abstract

Sentiment analysis extracts the emotions expressed in text. It has been employed in many fields, including politics, elections, movies, retail business and, in recent years, microblogs, to understand, track and control human sentiments or reactions toward products, events or ideas. Nevertheless, challenges such as differing writing styles, negation and sarcasm, spelling mistakes and newly invented words pose obstacles to the correct classification of sentiment. This paper presents an ensemble-of-classifiers framework for sentiment analysis. The proposed weighted majority voting ensemble method combines six models (Naïve Bayes, Logistic Regression, Stochastic Gradient Descent, Random Forest, Decision Tree and Support Vector Machine) into a single classifier. The weights of the individual classifiers in the ensemble are chosen as their accuracy or F1-score by optimizing their performance. The approach is contrasted with an ensemble that combines the same models by simple majority voting rather than weighted majority voting. Additionally, the six individual classifiers are compared with one another to evaluate their performance. The proposed ensemble model is tested on existing sentiment datasets, including SemEval 2017 Task 4A, 4B and 4C. The results show that Logistic Regression performs best among the individual classifiers. Furthermore, the proposed weighted majority voting ensemble of the six classifiers outperforms both the simple majority voting ensemble and every individual classifier.
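
The difference between simple and weighted majority voting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-tweet votes and the validation scores below are hypothetical, and the abstract does not specify how the weights are optimized, so each weight is simply taken to be a classifier's validation score.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier
             (assumed here to be a validation accuracy or F1-score).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Iterate labels in sorted order so that ties resolve deterministically.
    return max(sorted(totals), key=lambda label: totals[label])


def simple_majority_vote(predictions):
    """Simple majority voting is the special case where every weight is equal."""
    return weighted_majority_vote(predictions, [1.0] * len(predictions))


# Hypothetical votes from six classifiers for a single tweet.
votes = ["positive", "positive", "neutral", "neutral", "neutral", "negative"]
# Hypothetical validation scores for the same six classifiers, in the same order.
scores = [0.70, 0.68, 0.45, 0.44, 0.43, 0.50]

print(simple_majority_vote(votes))            # label with the most votes
print(weighted_majority_vote(votes, scores))  # label with the most total weight
```

With these made-up numbers, three weak classifiers outvote two strong ones under simple voting, while weighting by score lets the two stronger classifiers prevail. This is the kind of case in which a weighted ensemble can differ from an unweighted one.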
