Conference: International Conference on Very Large Data Bases

Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources



Abstract

The quality of web sources has been traditionally evaluated using exogenous signals such as the hyperlink structure of the graph. We propose a new approach that relies on endogenous signals, namely, the correctness of factual information provided by the source. A source that has few false facts is considered to be trustworthy. The facts are automatically extracted from each source by information extraction methods commonly used to construct knowledge bases. We propose a way to distinguish errors made in the extraction process from factual errors in the web source per se, by using joint inference in a novel multi-layer probabilistic model. We call the trustworthiness score we computed Knowledge-Based Trust (KBT). On synthetic data, we show that our method can reliably compute the true trustworthiness levels of the sources. We then apply it to a database of 2.8B facts extracted from the web, and thereby estimate the trustworthiness of 119M webpages. Manual evaluation of a subset of the results confirms the effectiveness of the method.
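The core idea, that a source with few false facts is trustworthy, can be illustrated with a deliberately simplified sketch. It is not the multi-layer probabilistic model the abstract describes: it assumes each extracted fact already comes with an estimated probability of being correct, and it ignores extraction errors entirely. The function name `estimate_trust`, the input layout, and the example hostnames are hypothetical.

```python
from collections import defaultdict


def estimate_trust(observations):
    """Average the correctness probability of the facts from each source.

    `observations` is an iterable of (source, p_correct) pairs, where
    p_correct is an assumed, externally supplied probability that one
    fact extracted from that source is true. This is a toy proxy for a
    trust score, not the joint inference used in the paper.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for source, p_correct in observations:
        totals[source] += p_correct
        counts[source] += 1
    return {source: totals[source] / counts[source] for source in totals}


# Hypothetical input: two sources with two extracted facts each.
observations = [
    ("site-a.example", 0.9),
    ("site-a.example", 0.8),
    ("site-b.example", 0.2),
    ("site-b.example", 0.4),
]
trust = estimate_trust(observations)
```

Under these assumptions, the source whose facts are more likely correct gets the higher score. Separating extraction errors from errors in the source itself, which the abstract names as the main contribution, would require modeling the extractors as well.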
