International Conference on Information Fusion

A Bayesian idealization of entity resolution

Abstract

Network theory has progressed a long way since the Erdős-Rényi model, identifying many important real-world phenomena that a good random graph model should capture, and producing more realistic models to capture many of them. However, these models are largely limited to the domain of simple networks (nodes and links only), leaving the remaining complications outside the realm of theory. In such cases, a practitioner with complicated data is left to make decisions or apply algorithms to compensate for these issues without the benefit of an underlying model. In this paper, we develop a simple generative model of the entity resolution problem. Noting its similarity to the association problem in data fusion, we develop principled inference equations for entity resolution analogous to those developed for data association. The framework for this effort is a ground-truth model for object states and for the network which links them, together with a Dirichlet process model for how the observed aliases of the objects are distributed among the observed transactions between them. The paper focuses on the derivation of the inference equations, and the result is demonstrated on an illustrative example. Because the framework is based on rigorous probabilistic models, it is particularly well suited to ambiguous scenarios in which no single entity resolution hypothesis stands out as the correct one.
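The abstract does not give the model's equations, so the following is only a minimal sketch of how a Dirichlet process can distribute aliases over an object's appearances in transactions, written in its Chinese-restaurant-process form. The function name `sample_aliases`, the concentration parameter `alpha`, and the choice to treat each object independently are assumptions made for illustration and are not taken from the paper.

```python
import random


def sample_aliases(num_uses, alpha, rng):
    """Assign an alias index to each of `num_uses` appearances of one object.

    Hypothetical illustration (not the paper's model): each appearance reuses
    an existing alias with probability proportional to how often that alias
    has already been used, or introduces a new alias with probability
    proportional to `alpha` (which must be positive).
    """
    counts = []   # counts[k] = number of appearances so far under alias k
    labels = []   # alias index chosen for each appearance, in order
    for _ in range(num_uses):
        # Existing alias k has weight counts[k]; the last slot is "new alias".
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)      # a previously unseen alias is created
        else:
            counts[k] += 1        # an existing alias is reused
        labels.append(k)
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    # Twenty transactions involving the same object; small alpha favors
    # few aliases, large alpha favors many.
    print(sample_aliases(20, alpha=1.0, rng=rng))
```

Under this assumed setup, entity resolution would run in the opposite direction: given the observed alias labels on transactions, infer which aliases belong to the same underlying object.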