IEEE International Conference on Data Engineering

Web-Scale Blocking, Iterative and Progressive Entity Resolution



Abstract

Entity resolution aims to identify descriptions of the same entity within or across knowledge bases. In this work, we provide a comprehensive and cohesive overview of the key research results in the area of entity resolution. We are interested in frameworks addressing the new challenges in entity resolution posed by the Web of data, in which real-world entities are described by interlinked data rather than documents. Since such descriptions are usually partial, overlapping, and sometimes evolving, entity resolution emerges as a central problem both for increasing dataset linking and for searching the Web of data for entities and their relations. We focus on Web-scale blocking, iterative, and progressive solutions for entity resolution. Specifically, to reduce the required number of comparisons, blocking places similar descriptions into blocks, and comparisons to identify matches are executed only between descriptions within the same block. To minimize the number of missed matches, an iterative entity resolution process can exploit any intermediate results of blocking and matching to discover new candidate description pairs for resolution. Finally, we overview works on progressive entity resolution, which attempt to discover as many matches as possible within a limited computing budget by estimating the matching likelihood of yet unresolved descriptions based on the matches found so far.
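To make the blocking step concrete, here is a minimal sketch of one simple scheme, token blocking, in which two descriptions share a block whenever any of their attribute values share a token. The abstract does not name a specific blocking method, so this choice is an assumption, and the function names `token_blocking` and `candidate_pairs` and the sample descriptions are hypothetical illustrations only.

```python
from collections import defaultdict
from itertools import combinations


def token_blocking(descriptions):
    """Group description ids by the tokens that appear in their attribute values.

    Assumed input shape: {description_id: {attribute_name: string_value}}.
    """
    blocks = defaultdict(set)
    for desc_id, attributes in descriptions.items():
        for value in attributes.values():
            for token in value.lower().split():
                blocks[token].add(desc_id)
    # A block holding a single description produces no comparison, so drop it.
    return {token: ids for token, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Collect the distinct pairs of descriptions that co-occur in some block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical descriptions from two knowledge bases ("a*" and "b*").
descriptions = {
    "a1": {"name": "Stanley Kubrick", "born": "New York"},
    "a2": {"name": "Tour Eiffel", "city": "Paris"},
    "b1": {"label": "Kubrick Stanley", "place": "Manhattan"},
    "b2": {"label": "Eiffel Tower", "place": "Paris"},
}

blocks = token_blocking(descriptions)
pairs = candidate_pairs(blocks)
all_pairs = len(list(combinations(descriptions, 2)))
print(f"{len(pairs)} candidate comparisons instead of {all_pairs}")
```

In this toy input only the pairs that share a token are kept for comparison, which is the saving that blocking is meant to deliver. A real system would still need a matching function to decide which candidate pairs are true matches.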
