Journal: Expert Systems with Applications

A novel approach to define the local region of dynamic selection techniques in imbalanced credit scoring problems

Abstract

Lenders, such as banks and credit card companies, use credit scoring models to evaluate the potential risk posed by lending money to customers, and thereby to mitigate losses due to bad credit. The profitability of banks thus depends heavily on the models used to decide on customer loans. State-of-the-art credit scoring models are based on machine learning and statistical methods. One of the major problems in this field is that lenders often deal with imbalanced datasets that usually contain many paid loans but very few unpaid ones (called defaults). Recently, dynamic selection methods combined with ensemble methods and preprocessing techniques have been evaluated to improve classification models on imbalanced datasets, showing advantages over static machine learning methods. In a dynamic selection technique, samples in the neighborhood of each query sample are used to compute the local competence of each base classifier. Then, the technique selects only competent classifiers to predict the query sample. In this paper, we evaluate the suitability of dynamic selection techniques for the credit scoring problem, and we present Reduced Minority k-Nearest Neighbors (RMkNN), an approach that improves on the state of the art in defining the local region of dynamic selection techniques for imbalanced credit scoring datasets. The proposed technique has superior prediction performance on imbalanced credit scoring datasets compared to the state of the art. Furthermore, RMkNN does not need any preprocessing or sampling method to generate the dynamic selection dataset (called DSEL). Additionally, we observe an equivalence between dynamic selection and static selection classification. We conduct a comprehensive evaluation of the proposed technique against state-of-the-art competitors on six real-world public datasets and one private one. Experiments show that RMkNN improves classification performance on the evaluated datasets in terms of AUC, balanced accuracy, H-measure, G-mean, F-measure, and Recall. © 2020 Elsevier Ltd. All rights reserved.
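
The generic dynamic selection step described in the abstract (find the query's neighborhood in DSEL, score each base classifier there, keep only the competent ones) can be sketched as follows. This is a minimal illustration, assuming a plain Euclidean k-nearest-neighbors region and local accuracy as the competence measure. It is not RMkNN, whose rule for defining the local region is not given in the abstract. All function names, the three toy base classifiers, and the data are hypothetical.

```python
import numpy as np

def local_region(X_dsel, query, k):
    """Indices of the k DSEL samples closest to the query (Euclidean distance)."""
    dists = np.linalg.norm(X_dsel - query, axis=1)
    return np.argsort(dists, kind="stable")[:k]

def local_competence(classifiers, X_dsel, y_dsel, region):
    """Accuracy of each base classifier on the samples of the local region."""
    X_loc, y_loc = X_dsel[region], y_dsel[region]
    return np.array([np.mean(clf(X_loc) == y_loc) for clf in classifiers])

def predict_dynamic(classifiers, X_dsel, y_dsel, query, k=3):
    """Majority vote among the classifiers with the highest local accuracy.

    Assumption for this sketch: a tied vote is resolved as class 1 (default).
    """
    region = local_region(X_dsel, query, k)
    comp = local_competence(classifiers, X_dsel, y_dsel, region)
    selected = [clf for clf, c in zip(classifiers, comp)
                if np.isclose(c, comp.max())]
    votes = np.array([clf(query[None, :])[0] for clf in selected])
    return int(2 * votes.sum() >= len(votes))

# Toy DSEL: label 0 = paid loan, label 1 = default (made-up data).
X_dsel = np.array([[0.1, 0.1], [0.2, 0.9], [0.3, 0.8], [0.9, 0.2],
                   [0.8, 0.1], [0.9, 0.9], [0.1, 0.8]])
y_dsel = np.array([0, 0, 0, 1, 1, 1, 0])

# Hypothetical base classifiers, written as simple functions of the features.
classifiers = [
    lambda X: (X[:, 0] > 0.5).astype(int),   # thresholds the first feature
    lambda X: (X[:, 1] > 0.5).astype(int),   # thresholds the second feature
    lambda X: np.zeros(len(X), dtype=int),   # always predicts the majority class
]

query = np.array([0.85, 0.12])
prediction = predict_dynamic(classifiers, X_dsel, y_dsel, query, k=3)
print(prediction)  # 1
```

In this toy run the three nearest DSEL samples contain two defaults, so the always-majority classifier scores only 1/3 locally and is excluded from the vote. With an imbalanced DSEL, a plain k-NN region is often dominated by the majority class, which is the kind of situation that motivates redefining the local region.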