Journal: The Annals of Statistics: An Official Journal of the Institute of Mathematical Statistics

SPECTRUM ESTIMATION FOR LARGE DIMENSIONAL COVARIANCE MATRICES USING RANDOM MATRIX THEORY

Abstract

Estimating the eigenvalues of a population covariance matrix from a sample covariance matrix is a problem of fundamental importance in multivariate statistics; the eigenvalues of covariance matrices play a key role in many widely used techniques, in particular in principal component analysis (PCA). In many modern data analysis problems, statisticians are faced with large datasets where the sample size, n, is of the same order of magnitude as the number of variables, p. Random matrix theory predicts that in this context, the eigenvalues of the sample covariance matrix are not good estimators of the eigenvalues of the population covariance. We propose to use a fundamental result in random matrix theory, the Marčenko-Pastur equation, to better estimate the eigenvalues of large dimensional covariance matrices. The Marčenko-Pastur equation holds in very wide generality and under weak assumptions. The estimator we obtain can be thought of as "shrinking" the eigenvalues of the sample covariance matrix in a nonlinear fashion to estimate the population eigenvalues. Inspired by ideas of random matrix theory, we also suggest a change of point of view when thinking about estimation of high-dimensional vectors: we do not try to estimate the vectors directly but rather a probability measure that describes them. We think this is a theoretically more fruitful way to think about these problems. Our estimator is fast and gives good or very good results in extended simulations. Our algorithmic approach is based on convex optimization. We also show that the proposed estimator is consistent.
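
The claim that sample eigenvalues are poor estimators when n and p are comparable can be illustrated with a small simulation. The sketch below is not the estimator proposed in the paper; it only assumes Gaussian data with identity population covariance, so every population eigenvalue equals 1, and compares the spread of the sample eigenvalues with the bulk edges (1 ± sqrt(p/n))^2 predicted by the Marčenko-Pastur law in that special case. The function name `sample_eigenvalues` and the sizes n = 1000, p = 500 are illustrative choices, not taken from the source.

```python
import numpy as np


def sample_eigenvalues(n, p, seed=0):
    """Eigenvalues of the sample covariance of n i.i.d. N(0, I_p) rows.

    Hypothetical helper for illustration only. The population covariance
    is the identity, so all population eigenvalues are exactly 1.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    # The mean is known to be zero here, so the data are not centered.
    S = X.T @ X / n
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return np.linalg.eigvalsh(S)


n, p = 1000, 500            # sample size and dimension of the same order
gamma = p / n               # aspect ratio p/n
eig = sample_eigenvalues(n, p)

# Edges of the Marčenko-Pastur bulk for identity covariance.
lower = (1 - np.sqrt(gamma)) ** 2
upper = (1 + np.sqrt(gamma)) ** 2

print(f"population eigenvalues: all equal to 1")
print(f"sample eigenvalues:     min={eig.min():.3f}, max={eig.max():.3f}")
print(f"predicted bulk edges:   [{lower:.3f}, {upper:.3f}]")
```

Under these assumptions the sample eigenvalues spread over an interval of width roughly 4·sqrt(p/n) around 1 rather than concentrating at 1, which is the kind of distortion a nonlinear shrinkage of the sample spectrum is meant to undo.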