Conference: IFIP WG 10.3 International Conference on Network and Parallel Computing

CSAS: Cost-Based Storage Auto-Selection, a Fine Grained Storage Selection Mechanism for Spark



Abstract

To improve system performance, Spark uses its caching mechanism to keep RDDs in memory for later access, and it provides a variety of storage levels for cached RDDs. However, the storage level is selected manually at RDD granularity, so it cannot adapt to the computing resources available on each node. In this paper, we first present a fine-grained automatic storage level selection mechanism. We then assign a storage level to each partition based on a cost model that fully considers system resource status as well as compression and serialization costs. Experiments show that our approach offers up to a 77% performance improvement over the default storage level scheme provided by Spark.
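
The abstract does not give the cost model itself, so the following is only an illustrative sketch of what a per-partition, cost-based choice could look like, not the authors' implementation. The level names MEMORY_ONLY, MEMORY_ONLY_SER and DISK_ONLY are real Spark storage levels; everything else (the `NodeStatus` and `PartitionProfile` fields, the cost formulas, and the "MEMORY_ONLY_SER (compressed)" label, which stands for a serialized partition with RDD compression enabled) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeStatus:
    """Hypothetical snapshot of one node's resources."""
    free_memory_mb: float       # memory currently available for cached partitions
    disk_read_mb_per_s: float   # sustained disk read throughput
    disk_write_mb_per_s: float  # sustained disk write throughput


@dataclass(frozen=True)
class PartitionProfile:
    """Hypothetical per-partition measurements."""
    size_mb: float              # deserialized, in-memory size
    ser_ratio: float            # serialized size / deserialized size
    compress_ratio: float       # compressed size / serialized size
    ser_s_per_mb: float         # seconds to serialize 1 MB of deserialized data
    deser_s_per_mb: float       # seconds to deserialize 1 MB of serialized data
    compress_s_per_mb: float    # seconds to compress 1 MB of serialized data
    decompress_s_per_mb: float  # seconds to restore 1 MB of serialized data
    expected_reads: int         # how often the cached partition is expected to be read


def candidate_costs(p: PartitionProfile, node: NodeStatus) -> dict:
    """Estimate total seconds for each storage option that fits on the node."""
    ser_mb = p.size_mb * p.ser_ratio
    comp_mb = ser_mb * p.compress_ratio
    ser_once = p.size_mb * p.ser_s_per_mb
    deser_each = ser_mb * p.deser_s_per_mb
    comp_once = ser_mb * p.compress_s_per_mb
    decomp_each = ser_mb * p.decompress_s_per_mb

    costs = {}
    if p.size_mb <= node.free_memory_mb:
        # Deserialized objects in memory: no conversion cost in this model.
        costs["MEMORY_ONLY"] = 0.0
    if ser_mb <= node.free_memory_mb:
        costs["MEMORY_ONLY_SER"] = ser_once + p.expected_reads * deser_each
    if comp_mb <= node.free_memory_mb:
        costs["MEMORY_ONLY_SER (compressed)"] = (
            ser_once + comp_once + p.expected_reads * (decomp_each + deser_each)
        )
    # Disk is assumed to always have room; pay one write plus one read per access.
    costs["DISK_ONLY"] = (
        ser_once
        + ser_mb / node.disk_write_mb_per_s
        + p.expected_reads * (ser_mb / node.disk_read_mb_per_s + deser_each)
    )
    return costs


def select_storage_level(p: PartitionProfile, node: NodeStatus) -> str:
    """Pick the feasible option with the lowest estimated cost."""
    costs = candidate_costs(p, node)
    return min(costs, key=costs.get)


if __name__ == "__main__":
    profile = PartitionProfile(100.0, 0.5, 0.5, 0.01, 0.01, 0.02, 0.01, 3)
    for free_mb in (200.0, 60.0, 10.0):
        node = NodeStatus(free_mb, 100.0, 100.0)
        print(free_mb, select_storage_level(profile, node))
```

In this toy model the choice shifts from deserialized memory to serialized memory to disk as free memory on the node shrinks, which is the kind of node-dependent, per-partition adjustment the abstract argues a fixed RDD-level setting cannot make.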
