Journal: SpringerPlus (U.S. National Institutes of Health literature collection)

Comparing writing style feature-based classification methods for estimating user reputations in social media

Abstract

In recent years, the anonymous nature of the Internet has made it difficult to detect manipulated user reputations in social media and to ensure the quality of users and their posts. To address this, the study designs and examines an automatic approach that uses writing style features to estimate user reputations in social media. Under several ways of defining Good and Bad classes of user reputation from the collected data, it evaluates the classification performance of state-of-the-art methods: four types of writing style features (lexical, syntactic, structural, and content-specific) and eight classification techniques, namely four base learners (C4.5, Neural Network (NN), Support Vector Machine (SVM), and Naïve Bayes (NB)) and four Random Subspace (RS) ensemble methods built on those base learners. With South Korea's Web forum Daum Agora as the test bed, the experimental results show that the full feature set, which includes content-specific features, combined with RS-SVM (the RS ensemble of SVMs) gives the best classification accuracy when poster reputations are segmented strictly into Good and Bad classes by the portfolio approach. Pairwise t tests on accuracy confirm two expectations drawn from the literature review: first, the feature set that adds content-specific features outperforms the others; second, ensemble learning methods are more viable than base learners. Moreover, among the four ways of defining user reputation classes (like, dislike, sum, and portfolio), the portfolio approach gives the highest accuracy.
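
The Random Subspace idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: a simple nearest-centroid learner stands in for SVM so the example needs only the standard library, and the four numeric features, the toy data, and all function names are hypothetical. Each ensemble member is trained on a random subset of the feature columns, and the members vote on the final Good/Bad label.

```python
import random
from collections import Counter


def train_centroid(X, y, feats):
    """Nearest-centroid base learner (a stand-in for SVM), using only columns in feats."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += row[f]
    # Per-class mean of each selected feature.
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(centroids, row, feats):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((row[f] - c[j]) ** 2 for j, f in enumerate(feats))
    return min(sorted(centroids), key=lambda label: dist(centroids[label]))


def train_random_subspace(X, y, n_learners=10, subspace_size=2, seed=0):
    """Train n_learners base learners, each on a random subset of feature indices."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_learners):
        feats = sorted(rng.sample(range(n_features), subspace_size))
        ensemble.append((feats, train_centroid(X, y, feats)))
    return ensemble


def predict_random_subspace(ensemble, row):
    """Majority vote over the ensemble; ties go to the alphabetically first label."""
    votes = Counter(predict_centroid(c, row, feats) for feats, c in ensemble)
    return max(sorted(votes), key=lambda label: votes[label])


# Hypothetical toy data: four made-up writing-style scores per poster.
X = [
    [0.9, 0.8, 0.7, 0.9],
    [0.8, 0.9, 0.8, 0.7],
    [0.7, 0.7, 0.9, 0.8],
    [0.1, 0.2, 0.3, 0.1],
    [0.2, 0.1, 0.2, 0.3],
    [0.3, 0.3, 0.1, 0.2],
]
y = ["Good", "Good", "Good", "Bad", "Bad", "Bad"]

ensemble = train_random_subspace(X, y, n_learners=5, subspace_size=2, seed=1)
print(predict_random_subspace(ensemble, [0.85, 0.75, 0.80, 0.90]))  # Good
print(predict_random_subspace(ensemble, [0.15, 0.20, 0.25, 0.10]))  # Bad
```

Training each member on a different feature subset is what distinguishes Random Subspace from ensembles that resample training rows; swapping in another base learner only requires replacing the two centroid functions.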