
A Vertex Matcher for Entity Resolution on Graphs


Abstract

Entity resolution (ER) is a widely studied, well-defined problem in data management. ER identifies and merges redundant mentions of the same entity across multiple datasets. It has been addressed with a variety of supervised, unsupervised, and probabilistic approaches to maintain data quality and reliability. ER is crucial for resolving records that refer to the same real-world entity. Traditionally, entities have been matched across databases by pairwise comparison, which is computationally intensive. These approaches also usually do not consider the structural similarity between records during comparison. To address these challenges, we propose a vertex matcher (vMatcher), a graph-based approach that represents the structural similarity between entities and matches entities only within their neighborhood, which significantly improves performance and efficiency. We also learn the threshold value for matching similarity from a subset of the training data in a supervised manner.
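The two ideas in the abstract, comparing records only within a graph neighborhood and learning the match threshold from labeled pairs, can be illustrated with a minimal sketch. This is not the vMatcher implementation: the abstract does not say how the graph is built, which similarity measure is used, or how the threshold is fitted. The sketch assumes the graph is given as an adjacency map, uses token-set Jaccard similarity as a stand-in measure, and picks the threshold that maximizes F1 on labeled pairs. All function names, records, and scores are hypothetical.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity (an assumed stand-in measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def candidate_pairs(adjacency):
    """Yield each unordered pair of vertices that are adjacent or share a neighbor.

    Only these pairs are compared, instead of all n*(n-1)/2 pairs.
    """
    seen = set()
    for v, nbrs in adjacency.items():
        local = sorted(set(nbrs) | {v})  # v together with its direct neighbors
        for pair in combinations(local, 2):
            if pair not in seen:
                seen.add(pair)
                yield pair


def learn_threshold(scored_labels):
    """Return the observed score that maximizes F1 on (score, is_match) pairs."""
    best_t, best_f1 = 1.0, -1.0
    for t in sorted({s for s, _ in scored_labels}):
        tp = sum(1 for s, y in scored_labels if s >= t and y)
        fp = sum(1 for s, y in scored_labels if s >= t and not y)
        fn = sum(1 for s, y in scored_labels if s < t and y)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:  # strict '>' keeps the lowest threshold among ties
            best_t, best_f1 = t, f1
    return best_t


def match(records, adjacency, threshold):
    """Return the sorted neighborhood pairs whose similarity reaches the threshold."""
    return sorted(
        (a, b)
        for a, b in candidate_pairs(adjacency)
        if jaccard(records[a], records[b]) >= threshold
    )


# Hypothetical records and graph, for illustration only.
records = {
    "a1": "john smith boston",
    "a2": "john smith boston ma",
    "b1": "mary jones austin",
    "b2": "mary jones",
    "c1": "john smith seattle",
}
adjacency = {
    "a1": ["a2", "b1"],
    "a2": ["a1"],
    "b1": ["a1", "b2"],
    "b2": ["b1"],
    "c1": [],
}
# Hypothetical labeled training pairs: (similarity score, is a true match).
training = [(0.9, True), (0.7, True), (0.6, True), (0.5, False), (0.2, False)]

threshold = learn_threshold(training)
matches = match(records, adjacency, threshold)
print(threshold, matches)
```

In this toy graph only 5 of the 10 possible record pairs are ever compared, which is the source of the efficiency gain the abstract describes. The trade-off is that an isolated vertex such as `c1` is never compared with anything, so recall depends on how well the graph captures true relationships.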
