Conference: International Conference on Computing, Mathematics and Statistics

SMOTE Approach to Imbalanced Dataset in Logistic Regression Analysis

Abstract

Logistic regression is a classification model commonly used in bankruptcy studies. The classifier works well when the data are balanced, but imbalanced data sets occur in almost all bankruptcy studies. The most common way to deal with the imbalance is to select and match samples from the bankrupt and non-bankrupt groups. Both the imbalance itself and the approach taken to deal with it can affect the quality of the predictive model. The objective of this study is to improve the classification accuracy of a logit model when the data are heavily skewed toward one class. The approach taken is SMOTE (Synthetic Minority Over-sampling Technique) sampling. The study used SMEs categorized under accommodation and food service activities, and the hotel sector. There are 14 explanatory variables. The results confirm that the AUC and sensitivity values of the SMOTE Logistic Regression (SLR) model are higher than those of a logit model.
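The abstract does not say how the SMOTE step was implemented. As a rough illustration only, the core idea of SMOTE can be sketched in plain Python: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours. The function name `smote`, its parameters, and the toy data are all assumptions made for this sketch, and base points are drawn at random here rather than iterated at a fixed oversampling rate as in the original algorithm.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Sketch of SMOTE: interpolate between minority samples and their
    k nearest minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: two made-up financial ratios per bankrupt firm.
bankrupt = [(0.10, 1.2), (0.15, 1.0), (0.05, 1.4), (0.12, 0.9)]
non_bankrupt_count = 20  # assumed size of the majority class

# Oversample the minority class until both classes have the same size.
new_points = smote(bankrupt, n_synthetic=non_bankrupt_count - len(bankrupt), k=3)
balanced_minority = bankrupt + new_points
print(len(balanced_minority))
```

In a setting like the one described, the oversampled minority class would be combined with the majority class before fitting the logit model, and AUC and sensitivity would be computed on held-out data that has not been oversampled.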
