Journal: Decision Support Systems

Identity matching and information acquisition: Estimation of optimal threshold parameters


Abstract

As customer interactions have shifted toward online channels, the volume of collected and stored data has grown, and handling data quality problems appropriately has become an important challenge for today's businesses. A key step in the data cleaning process is matching and merging customer records to establish the identity of individuals. The practical importance of this research is exemplified by a large client firm that deals with private label credit cards. The firm needed to know whether a new customer already had a history within the company in order to decide on the appropriate parameters of possible card offerings. The company incurs substantial costs if it incorrectly "matches" an incoming application with an existing customer (Type I error), and also if it falsely assumes that there is no match (Type II error). While a good deal of generic identity matching software is available that provides a "strength" score for each potential match, the question of how to use those scores for new applications is of great interest and is addressed in this work. The academic significance lies in the analysis of the score thresholds that are typically used in decision making: upper and lower thresholds are set, matches are accepted above the upper threshold, rejected below the lower threshold, and more information is gathered when the score falls between the two. We show, for the first time, that the optimal thresholds can be considered to be parameters of a matching distribution, and we develop and analyze a number of estimators of these parameters. Extensive computations then show the effects of various factors on the convergence rates of the estimates.
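The two-threshold decision rule described in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator; it only shows the accept / reject / gather-more-information rule. The names `Decision` and `decide`, the threshold values in the usage lines, and the treatment of scores exactly equal to a threshold (sent to the middle band) are all assumptions made for illustration.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"                # treat the application as an existing customer
    REJECT = "reject"                # treat the application as a new customer
    GATHER_MORE = "gather_more"      # score is inconclusive; acquire more information


def decide(score: float, lower: float, upper: float) -> Decision:
    """Apply a two-threshold rule to a match "strength" score.

    Assumption: scores exactly equal to a threshold fall in the middle band.
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if score > upper:
        return Decision.ACCEPT
    if score < lower:
        return Decision.REJECT
    return Decision.GATHER_MORE


# Hypothetical thresholds, chosen only to demonstrate the rule.
LOWER, UPPER = 0.3, 0.8
for s in (0.1, 0.5, 0.9):
    print(s, decide(s, LOWER, UPPER).value)
```

In this sketch, raising `UPPER` lowers the chance of a false match (Type I error) and lowering `LOWER` lowers the chance of a missed match (Type II error), in both cases at the price of sending more applications to the information-gathering band. That trade-off is what makes the choice of thresholds a cost question.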
