International Conference on Web Research

A Supervised Framework for Review Spam Detection in the Persian Language

Abstract

Sentiment analysis of online reviews has attracted increasing attention from both academia and industry. Although online reviews are valuable sources of information for gauging public opinion on different aspects of products, they may be written by spammers with various motives. Several methods have been proposed to detect such spam reviews in English, but no study on Persian spam detection has been reported so far. In the current study, Persian reviews of cell phones are investigated to find type 1 spam (fake reviews) and type 2 spam (reviews written only about brands). In the proposed framework, a labeled dataset, SpamPer, is first created using majority voting on the answers to 11 questions previously designed for spam detection by human annotators. Then several preprocessing steps for the Persian language are performed to refine the training data. Finally, review-based and metadata features are extracted. The results obtained on 3000 reviews from SpamPer show that the highest accuracy is achieved by the decision tree, with an F1-measure of 0.78. Moreover, the results reveal that the SVM on unbalanced data and the decision tree on balanced data perform better when trained on the combination of metadata and review-based features.
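
The labeling and evaluation steps mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the majority vote over the 11 answers is computed, so the code below assumes each answer is a yes/no flag and that a review is labeled spam when more than half of the answers are "yes". The function names `majority_vote` and `f1_measure` are hypothetical and are not taken from the paper.

```python
def majority_vote(answers):
    """Label a review from yes/no answers (assumed encoding: True = spam indicator).

    Assumption: the label is spam when strictly more than half of the
    answers are True. With 11 answers a tie cannot occur.
    """
    return sum(answers) > len(answers) / 2


def f1_measure(y_true, y_pred):
    """Compute the F1-measure for the positive (spam = True) class."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))          # true positives
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))    # false positives
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))    # false negatives
    if tp == 0:
        return 0.0  # no correct spam predictions, so precision and recall are 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only (not from SpamPer).
answers_for_one_review = [True] * 6 + [False] * 5
label = majority_vote(answers_for_one_review)

gold = [True, True, False, False]
predicted = [True, False, True, False]
score = f1_measure(gold, predicted)
```

The F1-measure is the harmonic mean of precision and recall, which is why it is often reported alongside accuracy when the spam and non-spam classes are unbalanced.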