IEEE International Conference on Parallel and Distributed Systems

Utilizing SSD to Alleviate Chunk Fragmentation in De-Duplicated Backup Systems

Abstract

Data deduplication, which removes redundant data so that only one copy of each duplicate block is actually stored, has been implemented in almost all storage appliances, including archival and backup systems, primary data storage, and SSD devices, to save storage space. However, as time goes on and more duplicate blocks are ingested into the system, a fragmentation problem emerges: logically contiguous data blocks of later-stored datasets become dispersed over a large storage space, so restoring them requires many extra disk accesses, significantly degrading restore performance and garbage collection efficiency. Existing approaches to the fragmentation problem sacrifice space savings for performance by selectively rewriting trouble-causing duplicate blocks during deduplication, even though these blocks have already been stored elsewhere. However, rewriting chunks into the system slows the backup process and reduces deduplication efficiency, since many duplicate chunks are allowed to remain in the system. In this work, we propose to deploy flash-based SSDs in the system to overcome the limitations of rewriting algorithms by exploiting the high performance of SSDs. Specifically, instead of rewriting, we migrate the trouble-causing blocks into SSD storage in the background when duplicate blocks are encountered. The idea is motivated by two observations. First, a separate migration process leverages the computing power of modern multi-core architectures. Second, restores are typically not performed immediately after backups, so there is no need to rewrite blocks on the performance-critical path. We integrate our proposal into two rewriting schemes and conduct comprehensive evaluations of its efficacy. Our results show that by provisioning a reasonable amount of SSD, backup performance and deduplication efficiency can be significantly improved, while only slightly increasing the number of container reads associated with restore operations.
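The migration path described in the abstract can be pictured with a minimal sketch: the deduplication path only flags trouble-causing duplicate chunks, a background worker copies them from the disk container store into a provisioned SSD tier, and the restore path prefers the SSD copy. This is an illustrative assumption, not the paper's implementation; the names (MigrationManager, hdd_store, ssd_store, on_duplicate) and the store interfaces are hypothetical, and the decision of which chunks are trouble-causing would come from an existing rewriting scheme such as capping.

```python
import queue
import threading

class MigrationManager:
    """Hypothetical sketch of background chunk migration to an SSD tier.

    Instead of rewriting a duplicate chunk into a new on-disk container on
    the backup critical path, the dedup path only enqueues its fingerprint;
    a separate worker thread copies the chunk to the SSD store later.
    """

    def __init__(self, hdd_store, ssd_store, capacity_chunks):
        self.hdd_store = hdd_store        # container-based store on disk (assumed interface)
        self.ssd_store = ssd_store        # flash tier for migrated chunks (assumed interface)
        self.capacity = capacity_chunks   # provisioned SSD budget, in chunks
        self.pending = queue.Queue()
        self.worker = threading.Thread(target=self._migrate_loop, daemon=True)
        self.worker.start()

    def on_duplicate(self, fingerprint, container_id, is_trouble_causing):
        # Called from the dedup path. A rewriting scheme decides
        # `is_trouble_causing`; here we enqueue for migration instead of
        # rewriting, so the backup path does no extra container writes.
        if is_trouble_causing and not self.ssd_store.contains(fingerprint):
            self.pending.put((fingerprint, container_id))

    def _migrate_loop(self):
        # Runs on a spare core; restores typically do not start right after
        # a backup, so migration can lag behind without hurting restores.
        while True:
            fingerprint, container_id = self.pending.get()
            if self.ssd_store.size() >= self.capacity:
                continue                  # SSD budget exhausted; skip this chunk
            data = self.hdd_store.read_chunk(container_id, fingerprint)
            self.ssd_store.write_chunk(fingerprint, data)

    def read_chunk(self, fingerprint, container_id):
        # Restore path: prefer the SSD copy to avoid reading a sparse,
        # fragmented container from disk.
        if self.ssd_store.contains(fingerprint):
            return self.ssd_store.read_chunk(fingerprint)
        return self.hdd_store.read_chunk(container_id, fingerprint)
```

Under these assumptions, the only change to the backup path is replacing the rewrite of a trouble-causing chunk with an enqueue, which is why backup throughput and deduplication ratio are unaffected while the SSD copy later shortens restores.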

