Source: JMLR: Workshop and Conference Proceedings

Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion



Abstract

The sparse inverse covariance estimation problem is commonly solved using an $\ell_{1}$-regularized Gaussian maximum likelihood estimator known as "graphical lasso", but its computational cost becomes prohibitive for large data sets. A recent line of results showed, under mild assumptions, that the graphical lasso estimator can be retrieved by soft-thresholding the sample covariance matrix and solving a maximum determinant matrix completion (MDMC) problem. This paper proves an extension of this result, and describes a Newton-CG algorithm to efficiently solve the MDMC problem. Assuming that the thresholded sample covariance matrix is sparse with a sparse Cholesky factorization, we prove that the algorithm converges to an $\epsilon$-accurate solution in $O(n\log(1/\epsilon))$ time and $O(n)$ memory. The algorithm is highly efficient in practice: we solve the associated MDMC problems with as many as 200,000 variables to 7-9 digits of accuracy in less than an hour on a standard laptop computer running MATLAB.
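The soft-thresholding step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' MATLAB code: the function name `soft_threshold_covariance`, the parameter `lam`, and the choice to leave the diagonal unthresholded are all assumptions made for illustration, and the MDMC solve and the Newton-CG algorithm are not shown.

```python
import numpy as np

def soft_threshold_covariance(C, lam):
    """Soft-threshold the off-diagonal entries of a covariance matrix.

    Off-diagonal entries are shrunk toward zero by `lam`, and any entry
    with magnitude at most `lam` becomes exactly zero. The diagonal is
    kept as-is (an assumption of this sketch).
    """
    C = np.asarray(C, dtype=float)
    # Elementwise shrinkage: sign(c) * max(|c| - lam, 0)
    T = np.sign(C) * np.maximum(np.abs(C) - lam, 0.0)
    # Restore the original variances on the diagonal
    np.fill_diagonal(T, np.diag(C))
    return T

# Small hand-made symmetric example (illustrative values only)
C_example = np.array([
    [2.00, 0.50, -0.05],
    [0.50, 1.00, -0.30],
    [-0.05, -0.30, 3.00],
])
C_thresholded = soft_threshold_covariance(C_example, lam=0.1)
```

The sparsity pattern of the thresholded matrix is what would then be passed to an MDMC solver. For large problems a sparse matrix format would be preferable to the dense array used here.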
