International Conference on Very Large Data Bases (VLDB)

Compressed Linear Algebra for Large-Scale Machine Learning

Abstract

Large-scale machine learning (ML) algorithms are often iterative, using repeated read-only data access and I/O-bound matrix-vector multiplications to converge to an optimal model. It is crucial for performance to fit the data into single-node or distributed main memory. General-purpose, heavy- and lightweight compression techniques struggle to achieve both good compression ratios and fast decompression speed to enable block-wise uncompressed operations. Hence, we initiate work on compressed linear algebra (CLA), in which lightweight database compression techniques are applied to matrices and then linear algebra operations such as matrix-vector multiplication are executed directly on the compressed representations. We contribute effective column compression schemes, cache-conscious operations, and an efficient sampling-based compression algorithm. Our experiments show that CLA achieves in-memory operations performance close to the uncompressed case and good compression ratios that allow us to fit larger datasets into available memory. We thereby obtain significant end-to-end performance improvements up to 26x or reduced memory requirements.
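The core idea above, executing linear algebra directly on column-compressed data, can be illustrated with a small sketch. The Python code below is not the paper's implementation (CLA ships in Apache SystemML with OLE, RLE, and dictionary encodings plus cache-conscious multi-column operations); it is a minimal, hypothetical offset-list-style column encoding, with all function names invented here, showing how a matrix-vector product can be computed without decompressing the matrix.

```python
import numpy as np

def compress_column_ole(col):
    """Offset-list-style compression of one column: map each distinct
    value to the array of row indices where it occurs. Compact when
    the column has few distinct values."""
    groups = {}
    for i, v in enumerate(col):
        groups.setdefault(v, []).append(i)
    return {v: np.array(idx) for v, idx in groups.items()}

def compress_matrix(X):
    """Compress a matrix column by column; keep the shape for later ops."""
    return [compress_column_ole(X[:, j]) for j in range(X.shape[1])], X.shape

def matvec_compressed(compressed, shape, v):
    """Compute q = X @ v directly on the compressed columns: each
    distinct value d in column j contributes d * v[j] to every row
    where d occurs, so X is never materialized in dense form."""
    n_rows, _ = shape
    q = np.zeros(n_rows)
    for j, groups in enumerate(compressed):
        for d, offsets in groups.items():
            q[offsets] += d * v[j]
    return q

# Usage: a tall matrix with few distinct values per column compresses well.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(1000, 4)).astype(float)
v = rng.standard_normal(4)
comp, shape = compress_matrix(X)
assert np.allclose(matvec_compressed(comp, shape, v), X @ v)
```

On such columns the offset lists are far smaller than the dense data, and the multiply scans each list exactly once; this loosely mirrors the property that lets CLA approach uncompressed operation performance while fitting larger datasets in memory.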
