Conference: International Conference on Distributed Computing Systems

D-Swoosh: A Family of Algorithms for Generic, Distributed Entity Resolution

Abstract

Entity Resolution (ER) matches and merges records that refer to the same real-world entities, and is typically a compute-intensive process due to complex matching functions and high data volumes. We present a family of algorithms, D-Swoosh, for distributing the ER workload across multiple processors. The algorithms use generic match and merge functions, and ensure that new merged records are distributed to processors that may have matching records. We perform a detailed performance evaluation on a testbed of 15 processors. Our experiments use actual comparison shopping data provided by Yahoo!.
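
To make the abstract's idea concrete, the following is a minimal sketch and not the D-Swoosh algorithms themselves, whose details the abstract does not give. It assumes a record is a set of "attribute=value" strings, that two records match when they share any value, and that merging is set union; `match`, `merge`, `resolve`, `route`, and `distributed_resolve` are hypothetical names. Processors are simulated as in-memory shards, and a stable hash of each value decides which shards may hold a matching record, so a newly merged record is sent to every shard where it could still match something.

```python
import zlib
from typing import Callable, FrozenSet, Iterable, List, Set

# Assumption: a record is a set of "attribute=value" strings.
Record = FrozenSet[str]


def match(a: Record, b: Record) -> bool:
    """Hypothetical match rule: two records match if they share any value."""
    return bool(a & b)


def merge(a: Record, b: Record) -> Record:
    """Hypothetical merge rule: keep every value from both records."""
    return a | b


def resolve(records: Iterable[Record],
            match: Callable[[Record, Record], bool],
            merge: Callable[[Record, Record], Record]) -> List[Record]:
    """Single-processor ER: compare, merge, and re-queue until nothing matches."""
    pending = list(records)
    resolved: List[Record] = []
    while pending:
        current = pending.pop()
        partner = next((r for r in resolved if match(current, r)), None)
        if partner is None:
            if current not in resolved:
                resolved.append(current)
        else:
            # The merged record may match others, so it goes back in the queue.
            resolved.remove(partner)
            pending.append(merge(current, partner))
    return resolved


def route(record: Record, n: int) -> Set[int]:
    """Shards that may hold a match: one per value, chosen by a stable hash."""
    return {zlib.crc32(v.encode("utf-8")) % n for v in record} or {0}


def distributed_resolve(records: Iterable[Record], n: int) -> Set[Record]:
    """Simulate n processors; repeat rounds until no shard produces a change."""
    current = set(records)
    while True:
        shards: List[List[Record]] = [[] for _ in range(n)]
        for r in current:
            for p in route(r, n):
                shards[p].append(r)
        merged: Set[Record] = set()
        for shard in shards:  # each shard would run on its own processor
            merged.update(resolve(shard, match, merge))
        # Drop records fully contained in a larger merged record.
        result = {r for r in merged if not any(r < other for other in merged)}
        if result == current:
            return result
        current = result


if __name__ == "__main__":
    listings = [
        frozenset({"name=ipod", "sku=1"}),
        frozenset({"sku=1", "upc=9"}),
        frozenset({"upc=9", "color=red"}),
        frozenset({"name=zune"}),
    ]
    for entity in distributed_resolve(listings, n=4):
        print(sorted(entity))
```

Because `match` and `merge` are passed in as plain functions, the same loop works with any matching logic. The round-based re-routing is a simplification chosen for readability; a real deployment would exchange merged records between processors rather than rebuild every shard on each round.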
