Journal: IEEE Transactions on Parallel and Distributed Systems

Crocus: Enabling Computing Resource Orchestration for Inline Cluster-Wide Deduplication on Scalable Storage Systems

Abstract

Inline deduplication dramatically improves storage space utilization. However, it degrades I/O throughput because of compute-intensive deduplication operations in the I/O path, such as chunking, fingerprinting or hashing of chunk content, and redundant lookup I/Os over the network. In particular, the fingerprint or hash generation of content is computationally expensive and contributes largely to the degraded I/O throughput. In this article, we propose Crocus, a framework that enables compute resource orchestration to enhance cluster-wide deduplication performance. Crocus takes into account all compute resources, such as local and remote {CPU, GPU}, by managing decentralized compute pools. An opportunistic Load-Aware Fingerprint Scheduler (LAFS) distributes and offloads compute-intensive deduplication operations to the compute pools in a load-aware fashion. Crocus is highly generic and can be adopted in both inline and offline deduplication with different storage tier configurations. We implemented Crocus in the Ceph scale-out storage system. Our extensive evaluation shows that, with a 4KB chunk size, Crocus reduces the fingerprinting overhead by 86 percent compared to Ceph with baseline deduplication, while maintaining high disk-space savings. Our proposed LAFS scheduler, when tested in different internal and external contention scenarios, also showed a 54 percent improvement over a fixed or static scheduling approach.
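
To make the pipeline described in the abstract concrete, the following is a minimal sketch of chunking, fingerprinting, and load-aware assignment of fingerprint work to compute pools. It is not the Crocus or LAFS implementation. The names ComputePool, pick_pool, and dedup_write are hypothetical, and fixed-size chunking, SHA-256, and the "least bytes assigned" load metric are all assumptions made for illustration; the abstract only states a 4KB chunk size.

```python
import hashlib
from dataclasses import dataclass

CHUNK_SIZE = 4096  # 4 KB, the chunk size quoted in the abstract


@dataclass
class ComputePool:
    """Hypothetical stand-in for a local or remote CPU/GPU compute pool."""
    name: str
    pending_bytes: int = 0  # crude load estimate: bytes assigned so far


def chunk_fixed(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """Fingerprint a chunk. SHA-256 is an assumption, not taken from the source."""
    return hashlib.sha256(chunk).hexdigest()


def pick_pool(pools: list[ComputePool]) -> ComputePool:
    """Load-aware choice: the pool with the least assigned work (ties go to the first)."""
    return min(pools, key=lambda p: p.pending_bytes)


def dedup_write(data: bytes, pools: list[ComputePool], index: dict[str, bytes]) -> int:
    """Deduplicate data against index and return the number of new bytes stored."""
    stored = 0
    for chunk in chunk_fixed(data):
        pool = pick_pool(pools)
        pool.pending_bytes += len(chunk)  # record the assignment as load
        fp = fingerprint(chunk)  # a real system would run this on `pool`; here it runs inline
        if fp not in index:  # unseen fingerprint means a unique chunk
            index[fp] = chunk
            stored += len(chunk)
    return stored


if __name__ == "__main__":
    pools = [ComputePool("local-cpu"), ComputePool("remote-gpu", pending_bytes=4096)]
    index: dict[str, bytes] = {}
    data = b"a" * 8192 + b"b" * 4096  # two identical chunks plus one distinct chunk
    print(dedup_write(data, pools, index), "bytes stored for", len(data), "bytes written")
    print({p.name: p.pending_bytes for p in pools})
```

A fixed or static scheduler, the baseline the abstract compares against, would correspond to always returning the same pool from pick_pool regardless of load.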
