To address the computational and storage cost of large-scale data sets, we propose an improved Kernel Principal Component Analysis (KPCA) based on first-order and second-order statistics. The large-scale data set is divided into small subsets, and the first-order and second-order statistics (the mean and the autocorrelation matrix) of each subset are treated as a single computational unit. A novel polynomial-matrix kernel function is adopted to compute the similarity between data matrices rather than between vectors. Because the kernel matrix is built over subsets instead of individual samples, its size is greatly reduced, which makes its computation feasible. The effectiveness of the method is demonstrated by experimental results on artificial and real data sets.
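The pipeline described above can be illustrated with a minimal sketch. The abstract does not give the form of the polynomial-matrix kernel or say how the mean enters it, so the sketch makes two assumptions: the kernel is a polynomial of the Frobenius inner product between two matrices, and only the autocorrelation matrices are compared. The function names (`subset_statistics`, `poly_matrix_kernel`, `reduced_kernel_pca`) and the parameters `degree` and `coef0` are hypothetical and are not taken from the paper.

```python
import numpy as np


def subset_statistics(X, n_subsets):
    """Split the rows of X into subsets and return, per subset,
    the mean vector and the autocorrelation matrix (X_s^T X_s / n_s)."""
    chunks = np.array_split(X, n_subsets)
    means = [c.mean(axis=0) for c in chunks]
    autocorrs = [c.T @ c / len(c) for c in chunks]
    return means, autocorrs


def poly_matrix_kernel(A, B, degree=2, coef0=1.0):
    """ASSUMED kernel form: (<A, B>_F + coef0) ** degree, where <A, B>_F
    is the Frobenius inner product. The paper's kernel may differ."""
    return (np.sum(A * B) + coef0) ** degree


def reduced_kernel_pca(X, n_subsets, n_components, degree=2, coef0=1.0):
    """Kernel PCA over subset statistics instead of individual samples.
    The kernel matrix is n_subsets x n_subsets, not n_samples x n_samples."""
    # ASSUMPTION: only the autocorrelation matrices are compared here.
    _, R = subset_statistics(X, n_subsets)
    m = len(R)
    K = np.array([[poly_matrix_kernel(R[i], R[j], degree, coef0)
                   for j in range(m)] for i in range(m)])

    # Center the kernel matrix in feature space.
    one = np.full((m, m), 1.0 / m)
    Kc = K - one @ K - K @ one + one @ K @ one

    # Eigendecomposition; eigh returns ascending eigenvalues, so reorder.
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]

    # Project each subset onto the leading components.
    scores = Kc @ vecs / np.sqrt(np.maximum(vals, 1e-12))
    return K, vals, scores
```

With 1000 samples split into 20 subsets, for example, the kernel matrix has 20 x 20 entries instead of 1000 x 1000, which is the size reduction the abstract refers to.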