Randomized algorithms for matrix operations.

Abstract

In this thesis, we present fast, Monte-Carlo algorithms for performing useful computations on large matrices (e.g. matrix multiplication, Singular Value Decomposition (SVD), etc.). Our algorithms are quite different from traditional numerical analysis approaches and generally fit the following model: we assume that the input matrices are too large to store in RAM and can reside only in external memory. We are allowed to read the matrices a few times and keep a “sketch” of the matrices in RAM. We allow only constant processing time after reading each element of the matrices; computing the “sketch” should be a very fast process. Indeed, one of the contributions of our work is to demonstrate that a “sketch” consisting only of a random sample of rows and/or columns of the matrices is adequate for efficient approximations. A crucial issue is how to form the random sample; an obvious choice is “blind” sampling (e.g. uniform sampling). Instead, we use adaptive sampling: in one pass through the data we compute sampling probabilities (we keep rows/columns with probability proportional to the square of their lengths) and, in a second pass, we draw the sample. Adaptive sampling offers substantial gains; it allows us to approximately solve problems in sparse matrices as well as dense matrices. After these two passes, we perform computations only on the “sketch” (the rows/columns that we kept).

The main focus of this thesis is a thorough presentation of the algorithms; we also address the issue of their optimality. We expect our algorithms to be useful in a large variety of settings where large data sets are modelled by matrices; indeed, we demonstrate their power by using them as tools to design approximation algorithms for various problems (mostly Information Retrieval and Data Mining applications). We believe that the underlying principle of this thesis (using adaptive sampling to create “sketches” of the data in a small number of passes) constitutes an appealing and intuitive direction to address the size and nature of modern data sets.
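As an illustration of the adaptive, length-squared sampling idea described in the abstract, the following is a minimal sketch of two-pass approximate matrix multiplication. It assumes Python with NumPy and in-memory arrays rather than the external-memory setting of the thesis; the function name approx_matmul and the exact choice of sampling distribution are illustrative, not taken from the thesis itself.

    import numpy as np

    def approx_matmul(A, B, s, rng=None):
        """Approximate A @ B from a sketch of s sampled column/row pairs."""
        rng = np.random.default_rng() if rng is None else rng
        # Pass 1: adaptive sampling probabilities, proportional to the
        # squared lengths of the columns of A.
        w = np.sum(A * A, axis=0)
        p = w / w.sum()
        # Pass 2: draw s column/row pairs and rescale each sampled pair
        # by 1/sqrt(s * p_i) so that the estimate is unbiased.
        idx = rng.choice(A.shape[1], size=s, replace=True, p=p)
        scale = 1.0 / np.sqrt(s * p[idx])
        C = A[:, idx] * scale            # sketch: s rescaled columns of A
        R = B[idx, :] * scale[:, None]   # sketch: s rescaled rows of B
        return C @ R                     # all further work uses only the sketch

With this rescaling, C @ R is an unbiased estimator of A @ B, and the expected error shrinks as the number of sampled pairs s grows; the thesis develops and analyzes sampling schemes of this kind and extends them to the SVD and other matrix computations.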

Bibliographic details

  • Author

    Drineas, Petros S.

  • Affiliation

    Yale University

  • Awarding institution: Yale University
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2003
  • Pages: 118 p.
  • Total pages: 118
  • Format: PDF
  • Language: eng
  • CLC classification: Automation technology; computer technology
  • Keywords
