CUR matrix decompositions for improved data analysis

Abstract

Principal components analysis and, more generally, the Singular Value Decomposition are fundamental data analysis tools that express a data matrix in terms of a sequence of orthogonal or uncorrelated vectors of decreasing importance. Unfortunately, being linear combinations of up to all the data points, these vectors are notoriously difficult to interpret in terms of the data and processes generating the data. In this article, we develop CUR matrix decompositions for improved data analysis. CUR decompositions are low-rank matrix decompositions that are explicitly expressed in terms of a small number of actual columns and/or actual rows of the data matrix. Because they are constructed from actual data elements, CUR decompositions are interpretable by practitioners of the field from which the data are drawn (to the extent that the original data are). We present an algorithm that preferentially chooses columns and rows that exhibit high “statistical leverage” and, thus, in a very precise statistical sense, exert a disproportionately large “influence” on the best low-rank fit of the data matrix. By selecting columns and rows in this manner, we obtain improved relative-error and constant-factor approximation guarantees in worst-case analysis, as opposed to the much coarser additive-error guarantees of prior work. In addition, since the construction involves computing quantities with a natural and widely studied statistical interpretation, we can leverage ideas from diagnostic regression analysis to employ these matrix decompositions for exploratory data analysis.
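The leverage-based selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state the sampling rule, so the details here are assumptions: the leverage score of column j is taken as the squared norm of its entries in the top-k right singular vectors, normalized to sum to 1; each column is kept independently with probability min(1, c * score); and the middle factor is formed as U = pinv(C) A pinv(R). The function names and the parameters `k` (target rank) and `c` (oversampling factor) are hypothetical, chosen for illustration only.

```python
import numpy as np


def leverage_scores(A, k):
    """Normalized leverage scores of the columns of A (assumes k <= min(A.shape)).

    Score j is the squared norm of column j of the top-k right singular
    vectors, divided by k, so the scores are nonnegative and sum to 1.
    """
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return np.sum(Vt[:k, :] ** 2, axis=0) / k


def select_columns(A, k, c, rng):
    """Keep column j independently with probability min(1, c * score_j)."""
    p = np.minimum(1.0, c * leverage_scores(A, k))
    keep = rng.random(A.shape[1]) < p
    return np.flatnonzero(keep)


def cur(A, k, c, rng):
    """Sketch of a CUR decomposition built from actual columns and rows of A.

    Rows are selected by applying the column rule to A.T.
    If c is too small, no column or row may be selected; callers
    should check that cols and rows are non-empty.
    """
    cols = select_columns(A, k, c, rng)
    rows = select_columns(A.T, k, c, rng)
    C = A[:, cols]          # actual columns of the data matrix
    R = A[rows, :]          # actual rows of the data matrix
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R, cols, rows
```

A call such as `cur(A, k=5, c=40, rng=np.random.default_rng(0))` returns the factors together with the indices of the selected columns and rows, and those indices are what make the result interpretable in terms of the original data. Larger values of `c` keep more columns and rows and generally reduce the approximation error `np.linalg.norm(A - C @ U @ R)`.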

Bibliographic record

Journal: Proceedings of the National Academy of Sciences of the United States of America
Authors: Michael W. Mahoney; Petros Drineas
Year (volume), issue: 2009 (106), 3
Year: 2009
Pages: 697–702
Total pages: 6
Original format: PDF
Keywords: randomized algorithms; singular value decomposition; principal components analysis; interpretation; statistical leverage

Similar literature

1. CUR Matrix Decompositions for Improved Data Analysis [J]. Michael W. Mahoney, Petros Drineas. Proceedings of the National Academy of Sciences of the United States of America. 2009, No. 3
2. Promote the Compression Efficiency of Digital Images by Using Improved CUR Matrix Decomposition Algorithm [J]. Qinghai Jin. 现代电子技术(英文). 2019, No. 1
3. Improving CUR Matrix Decomposition and the Nystrom Approximation via Adaptive Sampling [J]. Shusen Wang, Zhihua Zhang. Journal of Machine Learning Research. 2013, Apr
4. Sparse representation of hyperspectral data using CUR matrix decomposition [C]. Sigurdsson J., Ulfarsson M.O., Sveinsson J.R. IEEE International Geoscience and Remote Sensing Symposium. 2013
5. Comparing principal component analysis, singular value decomposition and non-negative matrix factorization using U.S. mortality data [D]. Hapuwitharana, J. C. 2014
6. Identification of candidate drugs using tensor-decomposition-based unsupervised feature extraction in integrated analysis of gene expression between diseases and DrugMatrix datasets [O]. Y.-h. Taguchi
7. CUR matrix decompositions for improved data analysis [O]. Mahoney, Michael W., Drineas, Petros. 2009
