Source: SIGKDD Explorations

Multi-Source Deep Learning for Information Trustworthiness Estimation

Abstract

In recent years, information trustworthiness has become a serious issue as user-generated content has come to prevail in our information world. In this paper, we investigate the problem of estimating information trustworthiness by correlating and comparing multiple data sources. To a certain extent, the degree of consistency is an indicator of information reliability: information that all sources unanimously agree on is more likely to be reliable. Based on this principle, we develop an effective computational approach to identify consistent information across multiple data sources. In particular, we analyze vast amounts of information collected from multiple review platforms (the multiple sources), on which people rate and review the items they have purchased. The major challenge is that different platforms attract different sets of users, so their information cannot be compared directly at the surface level. However, the latent reasons hidden in user ratings are mostly shared across sources, so inconsistency about an item appears only when some source provides ratings that deviate from these common latent reasons. We therefore propose a novel two-step procedure to calculate information consistency degrees for a set of items rated by multiple sets of users on different platforms. We first build a Multi-Source Deep Belief Network (MSDBN) to identify the common reasons hidden in multi-source rating data, and then calculate a consistency score for each item by comparing each individual source with the reconstructed data derived from the latent reasons. We conduct experiments on real user ratings collected from Orbitz, Priceline, and TripAdvisor for all hotels in Las Vegas and New York City. Experimental results demonstrate that the proposed approach successfully finds the hotels that receive inconsistent, and possibly unreliable, ratings.
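
The two-step idea in the abstract (estimate what the sources share, then score each item by how far a source deviates from it) can be illustrated with a deliberately simplified sketch. This is not the MSDBN and makes no claim about the paper's actual model or scoring formula. As an assumption for illustration, each source's ratings are standardized across items, and the per-item median of the standardized ratings stands in for the shared latent representation and its reconstruction. The function names, source labels, and toy ratings below are all hypothetical.

```python
from statistics import mean, median, pstdev


def zscores(values):
    """Standardize one source's ratings so that sources with
    different scales or user bases become roughly comparable."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def inconsistency_scores(ratings_by_source):
    """Return one score per item; a larger score means less agreement.

    ratings_by_source maps a source name to a list of per-item
    average ratings, with items in the same order for every source.
    ASSUMPTION: the cross-source median of standardized ratings is a
    crude stand-in for the shared latent reasons; the paper instead
    learns them with a Multi-Source Deep Belief Network.
    """
    standardized = [zscores(r) for r in ratings_by_source.values()]
    n_items = len(standardized[0])
    scores = []
    for i in range(n_items):
        per_source = [z[i] for z in standardized]
        shared = median(per_source)  # stand-in "reconstruction"
        # Score the item by its worst per-source deviation.
        scores.append(max(abs(z - shared) for z in per_source))
    return scores


# Made-up toy data: source_b uses a 10-point scale, and source_c
# disagrees with the other two sources on the last item.
toy_ratings = {
    "source_a": [4.0, 3.0, 5.0, 2.0, 4.0],
    "source_b": [8.0, 6.0, 10.0, 4.0, 8.0],
    "source_c": [4.0, 3.0, 5.0, 2.0, 1.0],
}
scores = inconsistency_scores(toy_ratings)
most_inconsistent = scores.index(max(scores))
print(most_inconsistent, [round(s, 3) for s in scores])
```

On this toy input the last item receives the largest score, because one source rates it far below what the other two imply. A learned latent model would replace the median step, since per-source standardization alone cannot capture the richer shared structure the abstract describes.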