Conference: International Conference on Information and Knowledge Engineering

Data Warehouse Integration Using Best Fit Matching


Abstract

Much of the research on information retrieval focuses on finding as many relevant pieces of data about a topic as possible. Applied to identity matching, this approach returns as many variants of a search name as it can find. That works well for some applications, but others require a best fit match, in which a single name is expected to have zero or one matches in another data set. Many systems accomplish this with a hard key, such as a driver's license number or Social Security number. This paper presents SQL-based soft matching to obtain a best fit match. The technique is valuable when one or more of the data sets being compared has data quality problems or lacks complete and reliable hard keys.
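
The abstract does not include the paper's SQL, so the following is only a minimal sketch of what SQL-based best fit soft matching could look like. It assumes a similarity score from Python's `difflib.SequenceMatcher` (registered in SQLite as a hypothetical `similarity` function), an arbitrary threshold of 0.8, and illustrative table names `left_set`, `right_set`, and `candidates`. It keeps a pair only when each name is the other's highest-scoring candidate, so every name ends up with zero or one matches.

```python
import sqlite3
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; a stand-in for any soft-match score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_fit_matches(left_names, right_names, threshold=0.8):
    """Return (left_name, right_name, score) rows with at most one match per name."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python scoring function to SQL (assumed name: similarity).
    conn.create_function("similarity", 2, similarity)
    conn.executescript(
        """
        CREATE TABLE left_set  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE right_set (id INTEGER PRIMARY KEY, name TEXT);
        """
    )
    conn.executemany("INSERT INTO left_set (name) VALUES (?)",
                     [(n,) for n in left_names])
    conn.executemany("INSERT INTO right_set (name) VALUES (?)",
                     [(n,) for n in right_names])

    # Score every pair once, then discard pairs below the threshold.
    conn.execute(
        """
        CREATE TABLE candidates AS
        SELECT l.id AS left_id, r.id AS right_id,
               similarity(l.name, r.name) AS score
        FROM left_set AS l CROSS JOIN right_set AS r
        """
    )
    conn.execute("DELETE FROM candidates WHERE score < ?", (threshold,))

    # Keep a pair only if no better candidate exists for either side.
    # Ties are broken by the lower id so each name appears at most once.
    rows = conn.execute(
        """
        SELECT l.name, r.name, c.score
        FROM candidates AS c
        JOIN left_set  AS l ON l.id = c.left_id
        JOIN right_set AS r ON r.id = c.right_id
        WHERE NOT EXISTS (
            SELECT 1 FROM candidates AS o
            WHERE o.left_id = c.left_id
              AND (o.score > c.score
                   OR (o.score = c.score AND o.right_id < c.right_id)))
          AND NOT EXISTS (
            SELECT 1 FROM candidates AS o
            WHERE o.right_id = c.right_id
              AND (o.score > c.score
                   OR (o.score = c.score AND o.left_id < c.left_id)))
        ORDER BY l.id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    left = ["Jon Smith", "Mary Jones", "Alan Turing"]
    right = ["John Smith", "Jon Smyth", "Mary Jonse", "Grace Hopper"]
    for left_name, right_name, score in best_fit_matches(left, right):
        print(f"{left_name} -> {right_name} ({score:.2f})")
```

This mutual-best rule is a simple choice, not a globally optimal assignment: a name whose top candidate prefers someone else stays unmatched rather than falling back to its second choice. A production version would likely score on more fields than the name alone.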
