
SAUD: Semantics-Aware and Utility-Driven Deduplication Framework for Primary Storage



Abstract

Data deduplication is an efficient technology for reducing storage cost in cloud storage systems, especially now that massive data volumes have become the norm in the era of Big Data. Primary storage, as the layer that interacts directly with service users, benefits from deduplication technologies because of its high manufacturing cost. However, since primary storage is constantly accessed by users, its workloads are mostly latency-sensitive. This workload characteristic makes it challenging to develop deduplication schemes for primary storage systems that are both performance-efficient and space-efficient. Existing deduplication schemes for primary storage pay little attention to achieving desirable space savings while keeping the inherent performance penalty small. In this paper, we propose SAUD, a Semantics-Aware and Utility-Driven deduplication framework that provides high space savings with a minor performance penalty for primary storage. SAUD delivers performance-oriented deduplication service by leveraging the file-level semantics of primary storage in a quantitative way. SAUD calculates the deduplication priority of files with diverse semantics and uses these priorities as deduplication instructions. Moreover, SAUD operates in a selective-on mode, dynamically regulating the deduplication process based on the real-time workload and system status, which further reduces the side effects on system performance. Comprehensive evaluations show that SAUD outperforms all other compared schemes on system read performance by an average of 54.6%. SAUD achieves 82.1% of the space efficiency of the most space-oriented scheme, whose read performance falls behind that of SAUD by as much as 80.1%.
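The abstract names two mechanisms, a per-file deduplication priority and a selective-on mode gated by system status, but does not give their formulas. The following is only a minimal sketch of how such a pipeline could be wired together. The attributes in `FileInfo`, the utility formula in `dedup_priority`, and the `load_threshold` and `budget` parameters are all hypothetical assumptions made for illustration, not SAUD's actual design.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical file-level semantics; the abstract does not list the real attributes."""
    name: str
    size_bytes: int
    reads_per_hour: float       # assumed proxy for how latency-sensitive the file is
    est_duplicate_ratio: float  # assumed estimate of redundant content, in [0, 1]


def dedup_priority(f: FileInfo) -> float:
    """Illustrative utility (an assumption): expected bytes saved,
    discounted by how often the file is read."""
    expected_saving = f.size_bytes * f.est_duplicate_ratio
    return expected_saving / (1.0 + f.reads_per_hour)


def select_for_dedup(files, system_load, load_threshold=0.7, budget=2):
    """Selective-on gate (assumed form): deduplicate nothing while the system
    is busy; otherwise return up to `budget` files, highest priority first."""
    if system_load >= load_threshold:
        return []
    ranked = sorted(files, key=dedup_priority, reverse=True)
    return ranked[:budget]


# Made-up example inputs, not data from the paper.
files = [
    FileInfo("vm-image.qcow2", 8_000_000_000, reads_per_hour=1.0, est_duplicate_ratio=0.6),
    FileInfo("hot-index.db", 500_000_000, reads_per_hour=900.0, est_duplicate_ratio=0.3),
    FileInfo("backup.tar", 2_000_000_000, reads_per_hour=0.0, est_duplicate_ratio=0.5),
]

idle_choice = [f.name for f in select_for_dedup(files, system_load=0.3)]
busy_choice = [f.name for f in select_for_dedup(files, system_load=0.9)]
```

Under these assumed inputs, the large, rarely read files rank ahead of the small, frequently read one, and no file is selected while the load is above the threshold. This mirrors the abstract's stated goal of saving space without hurting read latency, though the real scoring and gating logic may differ substantially.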
