Journal: Artificial Intelligence Research

Analysis of imbalanced data set problem: The case of churn prediction for telecommunication


Abstract

Class-imbalanced datasets are common in the mobile Internet industry. We tested three feature selection techniques: Random Forest (RF), Relative Weight (RW), and Standardized Regression Coefficients (SRC); three balancing methods: over-sampling (OS), under-sampling (US), and synthetic minority over-sampling (SMOTE); and one widely used classification method, RF. Each combined model consists of a feature selection technique, a balancing technique, and the classification method. The original dataset, which has 45 thousand records and 22 features, was used to evaluate the performance of both the feature selection and the balancing techniques. The experimental results revealed that SRC combined with SMOTE attained the minimum Cost, 1085. Calculating the Cost for all models identified the features most important for minimizing cost in telecommunication. Applying these combined models could help maximize profit with minimum expenditure on customer retention and help reduce customer churn rates.
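
To make the three balancing methods named in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`over_sample`, `under_sample`, `smote`), the neighbor count `k=3`, and the two-feature toy data are assumptions chosen for illustration, and the feature selection and RF classification steps are left out. The SMOTE function follows the usual idea of interpolating between a minority row and one of its nearest minority neighbors.

```python
import math
import random


def over_sample(minority, majority, rng):
    """Random over-sampling (OS): duplicate minority rows until classes match."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return minority + extra, majority


def under_sample(minority, majority, rng):
    """Random under-sampling (US): keep a random majority subset of minority size."""
    return minority, rng.sample(majority, len(minority))


def smote(minority, majority, rng, k=3):
    """SMOTE: add synthetic minority rows on segments between near neighbors."""
    synthetic = []
    for _ in range(len(majority) - len(minority)):
        base = rng.choice(minority)
        # The k minority rows closest to the base row (the base itself excluded).
        neighbors = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return minority + synthetic, majority


# Hypothetical toy data: 20 non-churn rows and 4 churn rows, two features each.
rng = random.Random(0)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
minority = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(4)]

for name, method in [("OS", over_sample), ("US", under_sample), ("SMOTE", smote)]:
    new_minority, new_majority = method(minority, majority, rng)
    print(name, len(new_minority), len(new_majority))
```

On this toy data, OS and SMOTE both grow the minority class to 20 rows, while US shrinks the majority class to 4. The difference is that OS only repeats existing rows, whereas SMOTE produces new rows that lie between existing minority rows.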