Conference: International Conference on Information and Communication Technology

Handling imbalanced data in customer churn prediction using combined sampling and weighted random forest



Abstract

Customer churn is a major problem in the telecommunications industry because it affects a company's revenue. When churn occurs, the share of records that describe churning customers is usually low. Unfortunately, these churn cases are exactly the ones that must be predicted early. The scarcity of churn records leads to the problem of imbalanced data, which makes it difficult to develop a good prediction model. This research applied a combination of sampling techniques and Weighted Random Forest (WRF) to improve the customer churn prediction model on a sample dataset from a telecommunications company in Indonesia. WRF is claimed to produce prediction models that perform well on imbalanced data. However, this research found that the performance of the prediction model developed by WRF on this dataset was still quite low. Sampling techniques were applied to overcome this problem: the research combined simple under-sampling with SMOTE. The results showed that combined sampling together with WRF produced a prediction model that performed better than the one built with WRF alone.
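The combined-sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy two-feature data, the target class sizes, and the neighbour count k=5 are all assumptions made for illustration. The sketch randomly drops majority-class (non-churn) rows and then creates synthetic minority-class (churn) rows by interpolating between a churn row and one of its nearest churn neighbours, which is the basic idea behind SMOTE.

```python
import random
from math import dist


def random_undersample(majority, n_keep, rng):
    """Simple under-sampling: keep a random subset of the majority rows."""
    return rng.sample(majority, min(n_keep, len(majority)))


def smote(minority, n_new, k, rng):
    """Create n_new synthetic rows, each on the line segment between a
    minority row and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority rows")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def combined_sampling(majority, minority, majority_keep, minority_target,
                      k=5, seed=0):
    """Under-sample the majority class, then over-sample the minority
    class with SMOTE until it has minority_target rows."""
    rng = random.Random(seed)
    kept = random_undersample(majority, majority_keep, rng)
    n_new = max(0, minority_target - len(minority))
    augmented = list(minority) + smote(minority, n_new, k, rng)
    return kept, augmented


# Toy data (hypothetical): 200 non-churn rows and 10 churn rows.
data_rng = random.Random(42)
non_churn = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
churn = [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(10)]

kept, churn_aug = combined_sampling(non_churn, churn,
                                    majority_keep=60, minority_target=40)
print(len(kept), len(churn_aug))  # 60 40
```

The resampled rows could then be passed to a class-weighted forest. One option, offered only as an approximation of WRF, is scikit-learn's RandomForestClassifier with its class_weight parameter; the abstract does not state which WRF implementation or weights the authors used.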
