Kernel Principal Component Analysis for Large Scale Data Set


Abstract

Kernel principal component analysis (KPCA) provides an extremely powerful approach to extracting nonlinear features via the kernel trick, and it has been suggested for a number of applications. While the nonlinearity is obtained through the use of Mercer kernels, standard KPCA can only process a limited number of training samples: on a large-scale data set, diagonalizing the full kernel matrix becomes computationally expensive and demands large storage space. In this paper, by choosing a subset of the entire training set using Gram-Schmidt orthonormalization and incomplete Cholesky decomposition, we reformulate KPCA as an eigenvalue problem for a matrix whose size is much smaller than that of the kernel matrix. Theoretical analysis and experimental results on both artificial and real data show the advantages of the proposed method in terms of computational efficiency and storage space, especially when the number of data points is large.
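The core idea of the abstract — replacing the n × n kernel eigenproblem with a much smaller one built from a subset of the training samples — can be illustrated with a Nyström-style sketch. This is not the paper's exact algorithm: the paper selects the subset via Gram-Schmidt orthonormalization and incomplete Cholesky decomposition, whereas this sketch uses a random subset, assumes an RBF kernel, and omits kernel-matrix centering for brevity.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives
    # from floating-point cancellation.
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def subset_kpca(X, m, gamma=1.0, n_components=2, seed=0):
    """Approximate KPCA scores using a subset of m samples.

    Random subset selection stands in for the paper's Gram-Schmidt /
    incomplete-Cholesky selection (an assumption for this sketch).
    Only m x m matrices are ever diagonalized, instead of the n x n
    kernel matrix.
    """
    n = X.shape[0]
    idx = np.random.default_rng(seed).choice(n, m, replace=False)
    S = X[idx]
    K_mm = rbf_kernel(S, S, gamma)   # m x m kernel on the subset
    K_nm = rbf_kernel(X, S, gamma)   # n x m cross-kernel

    # Eigendecompose the small subset kernel (jitter for stability).
    lam, U = np.linalg.eigh(K_mm + 1e-10 * np.eye(m))
    lam = np.maximum(lam, 1e-12)

    # Z Z^T approximates the full kernel matrix, so the top eigenvectors
    # of the n x n kernel can be recovered from the m x m matrix Z^T Z.
    Z = K_nm @ (U / np.sqrt(lam))            # n x m
    e, V = np.linalg.eigh(Z.T @ Z)           # small m x m eigenproblem
    order = np.argsort(e)[::-1][:n_components]
    return Z @ V[:, order]                   # approximate KPCA scores, n x k
```

Because every eigendecomposition involves only m × m matrices, the cost and storage scale with m rather than n, which is the advantage the abstract claims for large data sets.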
