IEEE International Conference on Cloud Computing Technology and Science

VOLUME: Enable Large-Scale In-Memory Computation on Commodity Clusters



Abstract

Traditional cloud computing technologies, such as MapReduce, use file systems as the system-wide substrate for data storage and sharing. A distributed file system provides a global name space and stores data persistently, but it also introduces significant overhead. Several recent systems use DRAM to store data and tremendously improve the performance of cloud computing systems. However, both our own experience and related work indicate that a simple substitution of distributed DRAM for the file system does not provide a solid and viable foundation for data storage and processing in the data center environment, and the capacity of such systems is limited by the amount of physical memory in the cluster. To overcome these challenges, we construct VOLUME (Virtual On-Line Unified Memory Environment), a distributed virtual memory that unifies the physical memory and disk resources on many compute nodes to form a system-wide data substrate. The new substrate provides a general memory-based abstraction, takes advantage of DRAM in the system to accelerate computation, and, transparently to programmers, scales the system to handle large datasets by swapping data to disks and remote servers. The evaluation results show that VOLUME is much faster than Hadoop/HDFS, delivering 6-11x speedups on the adjacency list workload. VOLUME is faster than both Hadoop/HDFS and Spark/RDD for in-memory sorting. For k-means clustering, VOLUME scales linearly to 160 compute nodes on the TH-1/GZ supercomputer.
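The core idea the abstract describes, a memory abstraction that keeps hot data in DRAM and transparently spills the rest to disk when physical memory runs out, can be illustrated with a toy single-node sketch. This is not the VOLUME implementation (which is distributed and swaps to remote servers as well); all class and method names here are illustrative, and a simple LRU policy stands in for whatever replacement policy the real system uses:

```python
import os
import pickle
import tempfile
from collections import OrderedDict

class SpillingStore:
    """Toy key-value substrate: hot entries live in memory (an OrderedDict
    in LRU order); when the in-memory budget is exceeded, the least
    recently used entry is transparently spilled to a file on disk."""

    def __init__(self, max_in_memory=2, spill_dir=None):
        self.max_in_memory = max_in_memory
        self.spill_dir = spill_dir or tempfile.mkdtemp(prefix="spill_sketch_")
        self.hot = OrderedDict()   # key -> value, least recently used first
        self.cold = {}             # key -> path of the spilled file

    def _spill_lru(self):
        key, value = self.hot.popitem(last=False)  # evict the LRU entry
        path = os.path.join(self.spill_dir, f"{len(self.cold)}.pkl")
        with open(path, "wb") as f:
            pickle.dump(value, f)
        self.cold[key] = path

    def put(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)                  # mark as most recently used
        while len(self.hot) > self.max_in_memory:
            self._spill_lru()

    def get(self, key):
        if key in self.hot:                        # DRAM hit: fast path
            self.hot.move_to_end(key)
            return self.hot[key]
        path = self.cold.pop(key)                  # miss: fault the value back in
        with open(path, "rb") as f:
            value = pickle.load(f)
        os.remove(path)
        self.put(key, value)                       # promote back to memory
        return value
```

A caller uses `put`/`get` as if everything were in memory; the spill-and-fault traffic is invisible, which is the "transparent to programmers" property the abstract claims for VOLUME.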
