Conference: International Conference on Systems and Informatics

An Automatic Interaction Detection Hybrid Model for Bankcard Response Classification

Abstract

Data mining techniques have numerous applications in bankcard response modeling. Logistic regression is the standard modeling tool in the financial industry because of its consistently strong performance and its interpretability. In this paper, we propose a hybrid bankcard response model that integrates decision-tree-based chi-square automatic interaction detection (CHAID) into logistic regression. In the first stage of the hybrid model, CHAID analysis is used to detect potential variable interactions. In the second stage, these potential interactions serve as additional input variables in the logistic regression. The motivation for the hybrid model is that adding variable interactions may improve the performance of logistic regression. In theory, all possible interactions could be added to the logistic regression and the significant ones identified by feature selection procedures. However, even stepwise selection is very time-consuming when the number of independent variables is large, and adding every interaction tends to cause the p >> n problem, in which the number of predictors p far exceeds the number of observations n. Using CHAID analysis to detect variable interactions, by contrast, has the potential to overcome these drawbacks. To demonstrate its effectiveness, the proposed hybrid model is evaluated on a real credit customer response data set. The results show that, by identifying potential interactions among independent variables, the hybrid approach outperforms logistic regression without an interaction search in terms of classification accuracy, the area under the receiver operating characteristic (ROC) curve, and the Kolmogorov-Smirnov (KS) statistic. Furthermore, CHAID analysis for interaction detection is much more computationally efficient than the stepwise search described above, and some of the identified interactions are shown to have statistically significant predictive power for the target variable. Finally, the customer profile created from the CHAID tree provides a reasonable interpretation of the interactions, which is required by regulations in the credit industry. Hence, this study provides an alternative for handling bankcard classification tasks.
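
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary 0/1 predictors, ranks splits by the raw Pearson chi-square statistic against a fixed 5% critical value, and omits the category merging and Bonferroni-adjusted p-values of full CHAID. The names `chi_square`, `best_split`, `detect_interactions`, and `add_interactions`, the toy data, and the rule that a parent split followed by a significant child split marks a candidate interaction are all assumptions made for illustration.

```python
from collections import Counter

# Assumption: all variables are binary, so every test has 1 degree of freedom.
CRITICAL_1DF = 3.841  # 5% critical value of the chi-square distribution, 1 df


def chi_square(xs, ys):
    """Pearson chi-square statistic for two categorical sequences."""
    n = len(xs)
    joint, row, col = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (joint[(r, c)] - row[r] * col[c] / n) ** 2 / (row[r] * col[c] / n)
        for r in row for c in col
    )


def best_split(rows, target, candidates):
    """Return the candidate most associated with the target, or None."""
    ys = [r[target] for r in rows]
    stats = {v: chi_square([r[v] for r in rows], ys) for v in candidates}
    var = max(candidates, key=stats.get)
    return var if stats[var] > CRITICAL_1DF else None


def detect_interactions(rows, target, candidates):
    """Stage 1: grow a two-level tree and collect (parent, child) pairs."""
    first = best_split(rows, target, candidates)
    if first is None:
        return set()
    rest = [v for v in candidates if v != first]
    pairs = set()
    for value in sorted({r[first] for r in rows}):
        branch = [r for r in rows if r[first] == value]
        second = best_split(branch, target, rest)
        if second is not None:
            pairs.add((first, second))
    return pairs


def add_interactions(rows, pairs):
    """Stage 2 input: append one product term per candidate pair."""
    return [{**r, **{f"{u}*{v}": r[u] * r[v] for u, v in pairs}} for r in rows]


# Toy data (hypothetical): response occurs only when both a and b are 1.
rows = [
    {"a": a, "b": b, "c": c, "y": int(a and b)}
    for a in (0, 1) for b in (0, 1) for c in (0, 1) for _ in range(5)
]
pairs = detect_interactions(rows, "y", ["a", "b", "c"])
augmented = add_interactions(rows, pairs)
print(pairs)
```

The augmented rows, with the extra product columns, would then be passed to any logistic regression fitter; whether a candidate term is retained is decided by its significance in that second-stage model.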
