Conference: International Conference on Artificial Intelligence in HCI; International Conference on Human-Computer Interaction

What Emotions Make One or Five Stars? Understanding Ratings of Online Product Reviews by Sentiment Analysis and XAI


Abstract

When people buy products online, they primarily base their decisions on the recommendations of others given in online reviews. The current work analyzed these online reviews with sentiment analysis and used the extracted sentiments as features to predict the product ratings with several machine learning algorithms. These predictions were disentangled by various methods of explainable AI (XAI) to understand whether the model showed any bias during prediction. Study 1 benchmarked these algorithms (k-nearest neighbors, support vector machines, random forests, gradient boosting machines, XGBoost) and identified random forests and XGBoost as the best algorithms for predicting the product ratings. In Study 2, the analysis of global feature importance identified the sentiment joy and the negative emotional valence as the most predictive features. Two XAI visualization methods, local feature attributions and partial dependence plots, revealed several incorrect prediction mechanisms at the instance level. When the benchmark was repeated as a classification task, Study 3 identified a high no-information rate of 64.4%, which indicated strong class imbalance as the underlying reason for the identified problems. In conclusion, good performance by machine learning algorithms must be interpreted with caution because the dataset, as encountered in this work, could be biased towards certain predictions. This work demonstrates how XAI methods reveal such prediction bias.
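
The no-information rate mentioned in the abstract is the accuracy a classifier would reach by always predicting the most frequent class, so a high value signals class imbalance. The following is a minimal sketch of that calculation, not the authors' code. The function name `no_information_rate` and the ratings in `toy_ratings` are hypothetical and were invented for illustration; they are not the paper's data.

```python
from collections import Counter


def no_information_rate(labels):
    """Return the share of the most frequent class in `labels`.

    This equals the accuracy of a trivial model that always
    predicts the majority class.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    majority_count = max(counts.values())
    return majority_count / len(labels)


# Hypothetical star ratings, invented for illustration (not the paper's data).
toy_ratings = [5, 5, 5, 5, 5, 5, 4, 3, 1, 1]

nir = no_information_rate(toy_ratings)
print(f"no-information rate: {nir:.1%}")  # prints "no-information rate: 60.0%"
```

A model evaluated on such data is only informative if its accuracy clearly exceeds this baseline, which is why a rate as high as the reported 64.4% calls for caution when reading accuracy figures.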
