
Towards an understanding of information credibility on online social networks.


Abstract

The increased adoption of online social networks such as Twitter has led to a deluge of available information. This creates a need for methods that quickly identify and extract useful, credible information from large amounts of noisy data. We first show the challenges in defining credibility for information in social media. Then, we develop supervised machine learning methods to extract credible information. We also define reasonable and meaningful credibility ground truth measures. To accomplish this, we deconstruct credibility and study individually the specific constructs that signal it. We then conduct a crowdsourced survey to collect ground truth credibility assessments. We find that surveys yield measurements that are often noisy and hard to work with. On Twitter, retweets are a form of endorsement by users and serve as a noisy in-network measure of credibility. We show that combining these measures yields ground truth measures where both sets of users agree on the credibility of a message. We find that models trained on these labeling schemes identify more useful messages and achieve higher accuracy than models trained to predict the individual noisy ground truth values.

A related task is identifying which pieces of information published on the social network are true. One approach to this problem treats humans on the social network as sensors with unknown reliability who sense the state of the world and report their observations as claims by publishing messages. Fact-finding algorithms use an unsupervised estimation-theoretic approach to jointly estimate the truthfulness of claims and the reliability of the human sensors that make them, given some prior beliefs. However, because the information available in Twitter streaming data is sparse, these algorithms have very little evidence with which to update the prior beliefs for claims corroborated by very few sources. We find that using simple heuristics in developing fusion methods to use the credibility predictions yields improvements in performance over the estimates reached by the fact finder alone.
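The interplay between a fact finder and a credibility classifier can be illustrated with a minimal sketch. It is not the thesis's estimation-theoretic method: the iterative update below is a simple sums-style heuristic swapped in for illustration, and the names `fact_finder` and `fuse`, the `min_sources` threshold, and the `weight` parameter are all hypothetical. The fusion rule is just one possible heuristic, in which a claim backed by too few sources borrows part of its score from a credibility prediction.

```python
from collections import defaultdict


def fact_finder(reports, prior=0.5, iterations=20):
    """Jointly score claims and sources from (source, claim) pairs.

    Illustrative sums-style heuristic, NOT the estimation-theoretic
    algorithm described in the abstract.
    """
    sources_of = defaultdict(set)   # claim -> sources asserting it
    claims_of = defaultdict(set)    # source -> claims it asserts
    for source, claim in reports:
        sources_of[claim].add(source)
        claims_of[source].add(claim)
    if not sources_of:
        return {}, {}, {}

    belief = {claim: prior for claim in sources_of}
    trust = {}
    for _ in range(iterations):
        # A source is as reliable as the average belief in its claims.
        trust = {s: sum(belief[c] for c in cs) / len(cs)
                 for s, cs in claims_of.items()}
        # A claim is believed in proportion to the total trust behind it,
        # rescaled so the best-supported claim has belief 1.
        raw = {c: sum(trust[s] for s in ss) for c, ss in sources_of.items()}
        top = max(raw.values())
        belief = {c: value / top for c, value in raw.items()}

    support = {c: len(ss) for c, ss in sources_of.items()}
    return belief, trust, support


def fuse(belief, support, credibility, min_sources=2, weight=0.5):
    """Hypothetical fusion heuristic: for sparsely corroborated claims,
    blend the fact-finder belief with a classifier's credibility score."""
    fused = {}
    for claim, b in belief.items():
        if support[claim] < min_sources and claim in credibility:
            fused[claim] = weight * credibility[claim] + (1 - weight) * b
        else:
            fused[claim] = b
    return fused


# Toy data: claim "c1" has three sources, claim "c2" only one.
reports = [("a", "c1"), ("b", "c1"), ("c", "c1"), ("d", "c2")]
belief, trust, support = fact_finder(reports)
fused = fuse(belief, support, credibility={"c2": 0.9})
```

In this toy run the single-source claim `c2` is driven toward zero by the fact finder alone, because one report gives the update almost nothing to work with. The fused score lets an assumed credibility prediction pull it back up, while the well-corroborated claim `c1` is left unchanged.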

Bibliographic record

  • Author

    Sikdar, Sujoy Kumar

  • Author affiliation

    Rensselaer Polytechnic Institute

  • Degree-granting institution Rensselaer Polytechnic Institute
  • Subject Computer science
  • Degree M.S.
  • Year 2015
  • Pages 74 p.
  • Total pages 74
  • Original format PDF
  • Language English
