Rule-Based Method for Entity Resolution

Abstract

The objective of entity resolution (ER) is to identify records that refer to the same real-world entity. Traditional ER approaches rely on pairwise similarity comparisons, which assume that records referring to the same entity are more similar to each other than to records referring to other entities. However, this assumption does not always hold in practice, and similarity comparisons perform poorly when it breaks down. We propose a new class of rules that can describe complex matching conditions between records and entities. Based on this class of rules, we formulate the rule-based entity resolution problem and develop an on-line approach for ER. In this framework, we apply rules to each record to identify which entity the record refers to. Additionally, we propose an effective and efficient rule discovery algorithm. We experimentally evaluate our rule-based ER algorithm on real data sets. The experimental results show that both our rule discovery algorithm and our rule-based ER algorithm achieve high performance.
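
To make the idea of applying rules to each record concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes a rule is simply a condition over a record's attributes paired with an entity identifier; the names `Rule`, `resolve`, the attribute names, and the entity labels `e1` and `e2` are all hypothetical and chosen only for illustration. The two "J. Smith" records share a name, so they would look alike under a name-based similarity comparison, yet the rules assign them to different entities.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, str]
# Hypothetical rule form: (condition over a record, entity the rule points to).
Rule = Tuple[Callable[[Record], bool], str]


def resolve(record: Record, rules: List[Rule]) -> Optional[str]:
    """Return the entity of the first rule the record satisfies, else None."""
    for condition, entity in rules:
        if condition(record):
            return entity
    return None


# Illustrative rules (assumed, not taken from the paper).
rules: List[Rule] = [
    (lambda r: r.get("name") == "J. Smith" and r.get("affiliation") == "Univ. A", "e1"),
    (lambda r: r.get("name") == "J. Smith" and r.get("topic") == "graphics", "e2"),
]

records: List[Record] = [
    {"name": "J. Smith", "affiliation": "Univ. A", "topic": "databases"},
    {"name": "J. Smith", "affiliation": "Univ. B", "topic": "graphics"},
    {"name": "K. Lee", "affiliation": "Univ. A", "topic": "databases"},
]

# Each record is resolved independently, one at a time.
resolved = [resolve(r, rules) for r in records]
```

Because each record is checked against the rules on its own, a new record can be resolved as it arrives without comparing it to every existing record, which is one plausible reading of the on-line setting the abstract describes.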
