IEEE Transactions on Parallel and Distributed Systems

Improving the Performance of Deduplication-Based Storage Cache via Content-Driven Cache Management Methods



Abstract

Data deduplication, a proven technology for effective data reduction in backup and archiving storage systems, also shows promise in increasing the logical space capacity of storage caches by removing redundant data. However, our in-depth evaluation of the existing deduplication-aware caching algorithms reveals that they only work well when the cached block size is set to 4 KB. Unfortunately, modern storage systems often set the block size to be much larger than 4 KB, and in this scenario the overall performance of these caching schemes drops below that of the conventional replacement algorithms without any deduplication. There are several reasons for this performance degradation. The first is the deduplication overhead, i.e., the time spent generating data fingerprints and using them to identify duplicate data; this overhead offsets the benefits of deduplication. The second is the extremely low cache space utilization caused by read and write alignment. The third is that existing algorithms exploit only access locality when selecting blocks for replacement, missing the opportunity to leverage content usage patterns, such as the intensity of content redundancy and sharing, in deduplication-based storage caches to further improve performance. We propose CDAC, a Content-driven Deduplication-Aware Cache, to address this problem. CDAC exploits the content redundancy within blocks and the intensity of content sharing among source addresses in its cache management strategies. We have implemented CDAC on top of the LRU and ARC algorithms, called CDAC-LRU and CDAC-ARC, respectively. Our extensive experimental results show that CDAC-LRU and CDAC-ARC outperform the state-of-the-art deduplication-aware caching algorithms, D-LRU and D-ARC, by up to 23.83X (3.23X on average) in read cache hit ratio and by up to 53.3 percent (49.8 percent on average) in IOPS, under a real-world mixed workload when the cache size ranges from 20 to 50 percent of the workload size and the block size ranges from 4 KB to 32 KB.
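To make the deduplication-aware caching idea concrete, below is a minimal, self-contained Python sketch (with hypothetical names such as DedupLRUCache; this is not the authors' implementation of CDAC). Duplicate blocks are stored once under their content fingerprint, and per-fingerprint reference counts approximate the intensity of content sharing among source addresses, which the eviction routine uses as a CDAC-style hint.

```python
import hashlib
from collections import OrderedDict

class DedupLRUCache:
    """Deduplication-aware LRU cache (illustrative sketch only).

    Duplicate blocks are stored once, keyed by a content fingerprint.
    Each fingerprint's reference count records how many source
    addresses share that content, a rough measure of content-sharing
    intensity that the eviction policy below uses as a hint.
    """

    def __init__(self, capacity_blocks: int):
        self.capacity = capacity_blocks  # assumed >= 1
        self.addr_to_fp = {}        # source address -> fingerprint
        self.store = OrderedDict()  # fingerprint -> [data, refcount], in LRU order

    @staticmethod
    def _fingerprint(data: bytes) -> str:
        # Content hash used to detect duplicate blocks.
        return hashlib.sha1(data).hexdigest()

    def get(self, addr: int):
        fp = self.addr_to_fp.get(addr)
        if fp is None or fp not in self.store:
            return None                    # cache miss
        self.store.move_to_end(fp)         # hit: refresh recency
        return self.store[fp][0]

    def put(self, addr: int, data: bytes):
        fp = self._fingerprint(data)
        old_fp = self.addr_to_fp.get(addr)
        if old_fp == fp:
            self.store.move_to_end(fp)     # unchanged content, refresh recency
            return
        if old_fp is not None:
            self._deref(old_fp)            # address now maps to new content
        if fp in self.store:
            self.store[fp][1] += 1         # duplicate content: share one copy
            self.store.move_to_end(fp)
        else:
            while len(self.store) >= self.capacity:
                self._evict()
            self.store[fp] = [data, 1]
        self.addr_to_fp[addr] = fp

    def _deref(self, fp: str):
        entry = self.store.get(fp)
        if entry is None:
            return
        entry[1] -= 1
        if entry[1] == 0:
            del self.store[fp]             # no source address uses it anymore

    def _evict(self):
        # Plain LRU would evict the head of the OrderedDict. A
        # CDAC-style policy instead prefers an unshared block
        # (refcount == 1), keeping highly shared blocks that serve
        # many source addresses; it falls back to the LRU head if
        # every cached block is shared.
        victim = next((fp for fp, e in self.store.items() if e[1] == 1),
                      next(iter(self.store)))
        del self.store[victim]
        for a in [a for a, f in self.addr_to_fp.items() if f == victim]:
            del self.addr_to_fp[a]         # invalidate stale address mappings
```

In this sketch, biasing eviction toward blocks with a reference count of 1 is one simple way to act on content-sharing intensity, in the spirit of (but not identical to) the CDAC policies described in the abstract; a D-LRU-like baseline would simply evict the LRU head regardless of sharing.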
