首页> 外文会议>Algorithms - ESA 2006; Lecture Notes in Computer Science; 4168 >Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods
【24h】

Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods

机译:子空间采样和相对误差矩阵逼近:基于列行的方法

获取原文
获取原文并翻译 | 示例

摘要

Much recent work in the theoretical computer science, linear algebra, and machine learning has considered matrix decompositions of the following form: given an m x n matrix A, decompose it as a product of three matrices, C, U, and R, where C consists of a small number of columns of A, R consists of a small number of rows of A, and U is a small carefully constructed matrix that guarantees that the product CUR is "close" to A. Applications of such decompositions include the computation of matrix "sketches", speeding up kernel-based statistical learning, preserving sparsity in low-rank matrix representation, and improved in-terpretability of data analysis methods. Our main result is a randomized, polynomial algorithm which, given as input an m x n matrix A, returns as output matrices C, U, R such thatrn‖A-CUR‖_F≤(1 + ε)‖A-A_k‖_Frnwith probability at least 1 — δ. Here, A_k is the "best" rank-k approximation (provided by truncating the Singular Value Decomposition of A), and ‖X‖_F is the Frobenius norm of the matrix X. The number of columns in C and rows in R is a low-degree polynomial in k, 1/ε, and log(1/δ). Our main result is obtained by an extension of our recent relative error approximation algorithm for l_2 regression from overcon-strained problems to general l_2 regression problems. Our algorithm is simple, and it takes time of the order of the time needed to compute the top k right singular vectors of A. In addition, it samples the columns and rows of A via the method of "subspace sampling," so-named since the sampling probabilities depend on the lengths of the rows of the top singular vectors, and since they ensure that we capture entirely a certain subspace of interest.
机译:理论计算机科学,线性代数和机器学习方面的许多最新工作都考虑了以下形式的矩阵分解:给定一个mxn矩阵A,将其分解为三个矩阵C,U和R的乘积,其中C包括少量的A列,R包括少量的A行,而U是精心构造的小型矩阵,可确保乘积CUR与A接近。此类分解的应用包括“草图”,加快基于内核的统计学习,在低秩矩阵表示中保留稀疏性,并改善数据分析方法的可解释性。我们的主要结果是一个随机多项式算法,给定一个mxn矩阵A作为输入,它作为输出矩阵C,U,R返回,使得rn‖A-CUR‖_F≤(1 +ε)‖A-A_k‖_Frn的概率为至少1-δ。此处,A_k是“最佳”秩k逼近(通过截断A的奇异值分解提供),“ X” _F是矩阵X的Frobenius范数。C中的列数和R中的行数是a k,1 /ε和log(1 /δ)的低阶多项式。我们的主要结果是通过将我们最近的相对误差近似算法扩展到l_2回归(从过度约束的问题到一般的l_2回归问题)而获得的。我们的算法很简单,花费的时间大约是计算A的前k个右奇异向量所需的时间。此外,它通过“子空间采样”方法(即所谓的“空间采样”)对A的行和列进行采样因为采样概率取决于顶部奇异矢量的行的长度,并且由于它们确保了我们完全捕获特定的感兴趣子空间。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号