Analysis of imbalanced data set problem: The case of churn prediction for telecommunication

Chun Gui

Abstract

Class-imbalanced datasets are common in the mobile Internet industry. We tested three feature selection techniques: Random Forest (RF), Relative Weight (RW), and Standardized Regression Coefficients (SRC); three balancing methods: over-sampling (OS), under-sampling (US), and synthetic minority over-sampling (SMOTE); and one widely used classification method, RF. Each combined model consists of a feature selection technique, a balancing technique, and the classification method. The original dataset, which has 45 thousand records and 22 features, was used to evaluate the performance of both the feature selection and the balancing techniques. The experimental results showed that SRC combined with SMOTE attained the minimum value, Cost = 1085. Calculating the Cost for all models identified the most important features for minimizing cost in telecommunication. Applying these combined models could help maximize profit with minimum expenditure on customer retention and help reduce customer churn rates.
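
The three balancing methods named in the abstract can be illustrated with a minimal sketch that uses only the Python standard library. This is not the paper's implementation: the toy data, the function names, and the choice of k = 3 neighbours are assumptions made for illustration, and the paper's dataset and Cost measure are not reproduced here.

```python
import math
import random


def random_over_sample(minority, target_size, rng):
    """Over-sampling (OS): duplicate random minority rows up to target_size."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return list(minority) + extra


def random_under_sample(majority, target_size, rng):
    """Under-sampling (US): keep a random subset of the majority rows."""
    return rng.sample(majority, target_size)


def smote(minority, n_new, k, rng):
    """SMOTE-style sampling: build n_new synthetic rows by interpolating
    between a minority row and one of its k nearest minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority rows
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


rng = random.Random(0)
# Hypothetical toy data: 20 non-churners, 4 churners, 2 features each.
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
minority = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(4)]

over = random_over_sample(minority, len(majority), rng)
under = random_under_sample(majority, len(minority), rng)
smoted = minority + smote(minority, len(majority) - len(minority), k=3, rng=rng)

# Minority class size after each balancing method (or majority size for US).
print(len(over), len(under), len(smoted))
```

Over-sampling and SMOTE both grow the minority class to the majority size, but only SMOTE adds new points rather than exact copies; under-sampling instead discards majority rows. In a pipeline like the one the abstract describes, the balanced data would then be passed to the classifier.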


Bibliographic details

Source: Artificial Intelligence Research, 2017, Issue 2, 7 pages
Author: Chun Gui
Affiliation: College of Mathematics and Computer Science, Northwest University for Nationalities
Original format: PDF
Language: English
Chinese Library Classification: Artificial intelligence theory
Keywords: Churn prediction; Class-imbalanced dataset; Random forest; Synthetic minority over-sampling; Cost; Customer retention


