Conference: IEEE International Conference on Smart City

An Elastic Data Persisting Solution with High Performance for Spark

Abstract

With the increasing popularity of in-memory computing, Spark [1] has been highly successful in implementing large-scale, data-intensive applications, especially those that reuse data across multiple parallel operations. However, Moore's Law has slowed down and memory remains costly. We therefore present an elastic data persisting solution for Spark that uses data compression to save JVM heap space and to reduce disk I/O for faster data access. We mathematically derive the criteria for selecting the optimal data compression and persisting plan. Our evaluation of a preliminary prototype shows that the solution can provide resource management recommendations by accounting for input data type, memory space, and CPU resources, and that it consistently yields high performance, accelerating Spark by up to 6x.
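The abstract does not state the selection criteria themselves, so the following is only an illustrative sketch of the kind of trade-off it describes: compression costs CPU time on every read but can keep more data in memory and avoid disk I/O. The cost model, the `Plan` class, the function names, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Plan:
    """A hypothetical persisting plan (not from the paper)."""
    name: str
    ratio: float            # compressed size / raw size (1.0 means no compression)
    decompress_mb_s: float  # decompression throughput in raw MB/s (inf if none needed)


def access_time(plan: Plan, data_mb: float, mem_mb: float, disk_mb_s: float) -> float:
    """Estimate the seconds needed to read the persisted data once.

    Assumed toy model: whatever does not fit in memory is read from disk,
    and compressed data pays a CPU cost proportional to its raw size.
    """
    stored_mb = data_mb * plan.ratio
    spilled_mb = max(0.0, stored_mb - mem_mb)
    disk_time = spilled_mb / disk_mb_s
    cpu_time = data_mb / plan.decompress_mb_s  # 0.0 when throughput is inf
    return disk_time + cpu_time


def choose_plan(plans, data_mb: float, mem_mb: float, disk_mb_s: float) -> Plan:
    """Pick the plan with the lowest estimated access time."""
    return min(plans, key=lambda p: access_time(p, data_mb, mem_mb, disk_mb_s))


# Made-up example plans and hardware figures, for illustration only.
PLANS = [
    Plan("uncompressed", ratio=1.0, decompress_mb_s=inf),
    Plan("fast codec", ratio=0.5, decompress_mb_s=2000.0),
    Plan("dense codec", ratio=0.25, decompress_mb_s=400.0),
]

if __name__ == "__main__":
    best = choose_plan(PLANS, data_mb=8000, mem_mb=4000, disk_mb_s=200)
    print(best.name)
```

Under these assumed numbers, data that fits in memory is best left uncompressed, a modest overflow favors the fast codec, and a slow disk pushes the choice toward the denser codec. In Spark itself, the corresponding knobs would be the storage level passed to `persist` and settings such as `spark.rdd.compress`; how the paper's prototype maps its plan onto them is not described in the abstract.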
