Journal: Knowledge and Data Engineering, IEEE Transactions on

SERIMI: Class-Based Matching for Instance Matching Across Heterogeneous Datasets


Abstract

State-of-the-art instance matching approaches do not perform well when used for matching instances across heterogeneous datasets. This shortcoming derives from their core operation depending on direct matching, which involves a direct comparison of instances in the source with instances in the target dataset. Direct matching is not suitable when the overlap between the datasets is small. Aiming at resolving this problem, we propose a new paradigm called class-based matching. Given a class of instances from the source dataset, called the class of interest, and a set of candidate matches retrieved from the target, class-based matching refines the candidates by filtering out those that do not belong to the class of interest. For this refinement, only data in the target is used, i.e., no direct comparison between source and target is involved. Based on extensive experiments using public benchmarks, we show our approach greatly improves the quality of state-of-the-art systems, especially on difficult matching tasks.
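The refinement step described in the abstract can be illustrated with a minimal sketch. It is one possible reading of the idea, not the authors' actual scoring function: the names `class_based_filter`, `jaccard`, the feature sets, and the `threshold` value are all hypothetical. The assumption made here is that a candidate belongs to the class of interest if it resembles the candidates retrieved for the other source instances of that class. Only target-side data is compared.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two feature sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def class_based_filter(
    candidates: Dict[str, List[str]],
    features: Dict[str, Set[str]],
    threshold: float = 0.5,  # hypothetical cut-off, chosen for illustration
) -> Dict[str, List[str]]:
    """Keep the candidates that look like members of the class of interest.

    candidates: source instance id -> target candidate ids retrieved for it
    features:   target candidate id -> set of target-side features
    Source-side data is never inspected; only target candidates are compared.
    """
    refined: Dict[str, List[str]] = {}
    for src, cands in candidates.items():
        # Candidate sets retrieved for the other instances of the same class.
        others = [c for s, c in candidates.items() if s != src]
        kept = []
        for cand in cands:
            # Best similarity to each other candidate set, then the average.
            scores = [
                max((jaccard(features[cand], features[o]) for o in other),
                    default=0.0)
                for other in others
            ]
            score = sum(scores) / len(scores) if scores else 0.0
            if score >= threshold:
                kept.append(cand)
        refined[src] = kept
    return refined


# Made-up example: three source instances from a class of drugs.
candidates = {"s1": ["t1", "t2"], "s2": ["t3"], "s3": ["t4", "t5"]}
features = {
    "t1": {"type:Drug", "has:formula"},
    "t2": {"type:Company", "has:revenue"},
    "t3": {"type:Drug", "has:formula", "has:dosage"},
    "t4": {"type:Drug", "has:formula"},
    "t5": {"type:Film", "has:director"},
}
refined = class_based_filter(candidates, features)
```

In this toy data the company and film candidates share no features with the candidates of the other source instances, so they are filtered out, while the drug-like candidates are kept.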
