Journal: Statistics and Its Interface

Sparse generalized principal component analysis for large-scale applications beyond Gaussianity

Abstract

Principal Component Analysis (PCA) is a dimension reduction technique. It produces inconsistent estimators when the dimensionality is moderate to high, which is often the problem in modern large-scale applications where algorithm scalability and model interpretability are difficult to achieve, not to mention the prevalence of missing values. While existing sparse PCA methods alleviate inconsistency, they are constrained to the Gaussian assumption of classical PCA and fail to address algorithm scalability issues. We generalize sparse PCA to the broad exponential family distributions under high-dimensional setup, with built-in treatment for missing values. Meanwhile, we propose a family of iterative sparse generalized PCA (SG-PCA) algorithms such that despite the non-convexity and non-smoothness of the optimization task, the loss function decreases in every iteration. In terms of ease and intuitive parameter tuning, our sparsity-inducing regularization is far superior to the popular Lasso. Furthermore, to promote overall scalability, accelerated gradient is integrated for fast convergence, while a progressive screening technique gradually squeezes out nuisance dimensions of a large-scale problem for feasible optimization. High-dimensional simulation and real data experiments demonstrate the efficiency and efficacy of SG-PCA.
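
To make the ideas in the abstract concrete, here is a minimal sketch of one member of the exponential family (Bernoulli, i.e. binary data) with a mask for missing entries. It is not the authors' SG-PCA algorithm. Several choices are assumptions made purely for illustration: sparsity is imposed by keeping only the `q` loading rows with the largest norm (the abstract does not say which regularizer the paper uses), plain majorization steps replace the accelerated gradient, and progressive screening is omitted. All function and parameter names (`sparse_logistic_pca`, `keep_top_rows`, `q`, `rank`) are hypothetical.

```python
import numpy as np

def bernoulli_loss(X, mask, Theta):
    # Negative log-likelihood over observed entries only:
    # b(theta) - x * theta, with b(theta) = log(1 + exp(theta)).
    return float(np.sum(mask * (np.logaddexp(0.0, Theta) - X * Theta)))

def sigmoid(T):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * T))

def keep_top_rows(V, q):
    # Row-wise hard thresholding: keep the q rows with the largest
    # Euclidean norm and zero out the rest (an assumed sparsity rule).
    out = np.zeros_like(V)
    if q > 0:
        idx = np.argsort(np.linalg.norm(V, axis=1))[-q:]
        out[idx] = V[idx]
    return out

def sparse_logistic_pca(X, mask, rank, q, n_iter=50, seed=0):
    # X: n-by-p binary matrix (NaN allowed where mask is False).
    # mask: boolean n-by-p matrix, True where X is observed.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    X = np.where(mask, X, 0.0)  # missing cells never enter the loss
    Z = 0.1 * rng.standard_normal((n, rank))                      # scores
    V = keep_top_rows(0.1 * rng.standard_normal((p, rank)), q)    # loadings
    losses = [bernoulli_loss(X, mask, Z @ V.T)]
    for _ in range(n_iter):
        # Loadings: gradient step with step size 1/L, then thresholding.
        # Since b''(theta) <= 1/4, L = 0.25 * ||Z||_2^2 bounds the curvature.
        G = mask * (sigmoid(Z @ V.T) - X)
        L_v = 0.25 * np.linalg.norm(Z, 2) ** 2 + 1e-12
        V = keep_top_rows(V - (G.T @ Z) / L_v, q)
        # Scores: unconstrained gradient step with the analogous bound.
        G = mask * (sigmoid(Z @ V.T) - X)
        L_z = 0.25 * np.linalg.norm(V, 2) ** 2 + 1e-12
        Z = Z - (G @ V) / L_z
        losses.append(bernoulli_loss(X, mask, Z @ V.T))
    return Z, V, losses
```

Each update minimizes a quadratic upper bound of the loss over its feasible set, so the recorded loss should not increase from one iteration to the next even though the row-sparsity constraint is non-convex. This mirrors the monotone-decrease property the abstract claims, under the assumptions stated above.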
