Detecting deception in Online Social Networks

Abstract

Over the past decade, Online Social Networks (OSNs) have helped hundreds of millions of people develop reliable computer-mediated relations. However, many user profiles in OSNs contain misleading, inconsistent, or false information. Existing studies have shown that lying in OSNs is widespread, often to protect a user's privacy. For OSNs to continue expanding their role as a communication medium in our society, it is crucial that information posted on them can be trusted. Here we define a set of analysis methods for detecting deceptive information about user gender on Twitter. We also report empirical results on a stratified data set of 174,600 Twitter profiles, split 50-50 between male and female users. Our automated approach compares gender indicators obtained from different profile characteristics, including first name, user name, and layout colors. Through extensive experimentation with this data set, we establish the overall accuracy of each indicator and the strength of every possible value of each indicator. We define male-trending and female-trending users based on two factors: the overall accuracy of each characteristic and the relative strength of the value of each characteristic for a given user. We apply a Bayesian classifier to the weighted average of characteristics for each user. We flag as possibly deceptive those profiles whose classified gender contradicts a self-declared gender that we obtain independently of the Twitter profile. Finally, we manually inspect a subset of the profiles identified as potentially deceptive to verify the correctness of our predictions.
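
The flagging idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the indicator accuracies, the `Indicator` type, the function names, and the `margin` parameter are all hypothetical, and a simple threshold stands in for the Bayesian classifier the abstract mentions. It only shows how an accuracy-weighted average of per-indicator strengths might be compared against a self-declared gender.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical overall accuracy of each indicator (illustrative values only,
# not taken from the paper).
INDICATOR_ACCURACY = {"first_name": 0.8, "user_name": 0.7, "color": 0.6}


@dataclass
class Indicator:
    name: str          # key into INDICATOR_ACCURACY
    male_score: float  # strength in [0, 1]; 1.0 = strongly male, 0.5 = neutral


def male_trend(indicators: List[Indicator]) -> float:
    """Accuracy-weighted average of the per-indicator male strengths."""
    total_weight = sum(INDICATOR_ACCURACY[i.name] for i in indicators)
    if total_weight == 0:
        return 0.5  # no evidence either way
    weighted = sum(INDICATOR_ACCURACY[i.name] * i.male_score for i in indicators)
    return weighted / total_weight


def flag_possible_deception(indicators: List[Indicator],
                            declared_gender: str,
                            margin: float = 0.2) -> bool:
    """Flag a profile when its trend contradicts the declared gender.

    A fixed threshold replaces the Bayesian classifier here (assumption).
    Profiles whose score lies within `margin` of 0.5 are never flagged.
    """
    score = male_trend(indicators)
    if score >= 0.5 + margin:
        predicted = "male"
    elif score <= 0.5 - margin:
        predicted = "female"
    else:
        return False
    return predicted != declared_gender


profile = [
    Indicator("first_name", 0.9),
    Indicator("user_name", 0.8),
    Indicator("color", 0.7),
]
print(flag_possible_deception(profile, declared_gender="female"))
```

In this sketch, a profile whose indicators all lean male but whose independently declared gender is female gets flagged, while a profile with neutral indicators is left alone. A flag would only mark a candidate for the manual inspection step the abstract describes.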