Journal: Industrial Engineering & Management Systems

Comparative Study of Dimension Reduction Methods for Highly Imbalanced Overlapping Churn Data


Abstract

Retaining customers who are likely to churn is one of the most important issues in customer relationship management, so companies try to predict churning customers from their large-scale, high-dimensional data. This study focuses on handling such large data sets by reducing their dimensionality. We compare six dimension reduction methods: principal component analysis (PCA), factor analysis (FA), locally linear embedding (LLE), local tangent space alignment (LTSA), locality preserving projections (LPP), and a deep auto-encoder. In our experiments, we apply each dimension reduction method to the training data, build a classification model on the mapped data, and then measure performance using the hit rate to compare the methods. PCA performs well despite its simplicity, and the deep auto-encoder gives the best overall performance. These results can be explained by the characteristics of churn prediction data, which is highly correlated and overlaps across the classes. We also propose a simple out-of-sample extension method for the nonlinear dimension reduction methods LLE and LTSA that exploits these characteristics of the data.
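The evaluation procedure described in the abstract (fit a dimension reduction method on the training data, train a classifier on the mapped data, score it by hit rate) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code. Several things here are assumptions not stated in the abstract: the synthetic data from scikit-learn's `make_classification` stands in for real churn data, logistic regression stands in for the unnamed classifier, only PCA and FA are shown, and `hit_rate` is defined as the share of true churners among the top-scored 10% of customers. The helper names `hit_rate` and `compare_reducers` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, FactorAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def hit_rate(y_true, scores, top_fraction=0.1):
    """Share of actual churners among the top-scored fraction of customers.

    Assumed definition: the abstract does not spell out how hit rate is computed.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    n_top = max(1, int(round(top_fraction * len(scores))))
    top_idx = np.argsort(scores)[::-1][:n_top]  # indices of the highest scores
    return float(y_true[top_idx].mean())


def compare_reducers(n_components=10, seed=0):
    # Synthetic stand-in for churn data (assumption): about 5% positives,
    # many redundant (correlated) features, weakly separated classes.
    X, y = make_classification(
        n_samples=4000, n_features=60, n_informative=8, n_redundant=40,
        weights=[0.95, 0.05], class_sep=0.5, random_state=seed,
    )
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    reducers = {
        "PCA": PCA(n_components=n_components, random_state=seed),
        "FA": FactorAnalysis(n_components=n_components, random_state=seed),
    }
    results = {}
    for name, reducer in reducers.items():
        # The pipeline fits the scaler and reducer on training data only,
        # then applies the learned mapping to the held-out data.
        model = make_pipeline(
            StandardScaler(), reducer, LogisticRegression(max_iter=1000)
        )
        model.fit(X_tr, y_tr)
        scores = model.predict_proba(X_te)[:, 1]
        results[name] = hit_rate(y_te, scores)
    return results


if __name__ == "__main__":
    for name, value in compare_reducers().items():
        print(f"{name}: hit rate in top 10% = {value:.3f}")
```

PCA and FA learn an explicit linear mapping, so held-out data can be transformed directly. LLE and LTSA are omitted because they would need the out-of-sample extension mentioned in the abstract, whose details are not given here.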
