IEEE Conference on Business Informatics

ELM: An Extended Logic Matching Method on Record Linkage Analysis of Disparate Databases for Profiling Data Mining

Abstract

As predictive marketing and customer profiling solutions have grown more sophisticated, they have become increasingly dependent on data from external sources. To use this data, the external records must be linked to internal records without unique identifiers. The Extendable Logic for Matching (ELM) performs probabilistic matching of records from disparate sources and classifies matches according to discrete values that reflect their utility. Sets of matching rules are evaluated by their performance on supervised classification tasks. High performance on a classification task indicates congruity with the real-world entity concerned, which gives a sense of matching quality without a gold standard. A set of matching rules generated using name and address was compared to a set matched using exact string comparison. We conclude that exact string comparison is a superior method for matching on highly sparse demographic data from disparate sources.
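
The abstract does not spell out ELM's actual rules, so the following is only a minimal Python sketch of the general idea it describes: linking records on name and address without a unique identifier and assigning each candidate pair a discrete match level. The names `Record`, `normalize`, `similarity`, and `match_level`, the 0.85 threshold, and the 0 to 3 level scale are all hypothetical choices made for illustration, and `difflib.SequenceMatcher` stands in for whatever comparison the authors used.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Record:
    """Hypothetical record with the two fields named in the abstract."""
    name: str
    address: str


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between the normalized forms of two strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_level(internal: Record, external: Record, fuzzy_threshold: float = 0.85) -> int:
    """Return an illustrative discrete match level (not the paper's scale).

    3: name and address are identical after normalization
    2: name and address are both at or above the fuzzy threshold
    1: only the name is at or above the fuzzy threshold
    0: no match
    """
    exact_name = normalize(internal.name) == normalize(external.name)
    exact_addr = normalize(internal.address) == normalize(external.address)
    if exact_name and exact_addr:
        return 3

    name_ok = similarity(internal.name, external.name) >= fuzzy_threshold
    addr_ok = similarity(internal.address, external.address) >= fuzzy_threshold
    if name_ok and addr_ok:
        return 2
    if name_ok:
        return 1
    return 0


# Example usage with made-up records.
crm = Record("Jon Smith", "45 Oak Avenue")
vendor = Record("John Smith", "45 Oak Avenue")
level = match_level(crm, vendor)
```

In a setup like the one the abstract describes, the resulting level could be used to decide which linked records feed a downstream supervised classifier, whose accuracy then serves as an indirect signal of matching quality.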
