Journal: IEEE Transactions on Parallel and Distributed Systems

Reducing Fragmentation for In-line Deduplication Backup Storage via Exploiting Backup History and Cache Knowledge


Abstract

In backup systems, the chunks of each backup are physically scattered after deduplication, which causes a challenging fragmentation problem. We observe that the fragmentation takes the form of sparse and out-of-order containers. Sparse containers decrease restore performance and garbage collection efficiency, while out-of-order containers decrease restore performance if the restore cache is small. To reduce the fragmentation, we propose the History-Aware Rewriting algorithm (HAR) and the Cache-Aware Filter (CAF). HAR exploits historical information in backup systems to accurately identify and reduce sparse containers, and CAF exploits restore cache knowledge to identify the out-of-order containers that hurt restore performance. CAF efficiently complements HAR in datasets where out-of-order containers are dominant. To reduce the metadata overhead of garbage collection, we further propose a Container-Marker Algorithm (CMA) that identifies valid containers instead of valid chunks. Our extensive experimental results from real-world datasets show that HAR improves restore performance by 2.84-175.36× at a cost of rewriting only 0.5-2.03 percent of the data.
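
The idea of using backup history to find sparse containers can be illustrated with a small sketch. The abstract does not give the algorithm's details, so everything concrete here is an assumption made for illustration: the utilization measure (referenced bytes divided by container capacity), the 0.5 threshold, the 4 MiB container size, the `(fingerprint, container_id, size)` record layout, and the function names `find_sparse_containers` and `plan_rewrites`. It is not the authors' implementation.

```python
from collections import defaultdict

# Assumed values for illustration only; the abstract does not specify them.
CONTAINER_SIZE = 4 * 1024 * 1024  # hypothetical container capacity in bytes
SPARSE_THRESHOLD = 0.5            # hypothetical utilization cutoff


def find_sparse_containers(backup_chunks, container_size=CONTAINER_SIZE,
                           threshold=SPARSE_THRESHOLD):
    """Return the IDs of containers that one backup uses only sparsely.

    backup_chunks is an iterable of (fingerprint, container_id, size)
    tuples, one per chunk reference in the backup stream.
    """
    referenced_bytes = defaultdict(int)
    seen = set()
    for fingerprint, container_id, size in backup_chunks:
        # Count each distinct chunk once, even if the backup references it
        # several times.
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        referenced_bytes[container_id] += size
    return {
        cid for cid, used in referenced_bytes.items()
        if used / container_size < threshold
    }


def plan_rewrites(next_backup_chunks, sparse_from_previous):
    """List fingerprints of duplicate chunks in the next backup that live in
    containers recorded as sparse for the previous backup. Rewriting these
    chunks into new containers is the assumed remedy."""
    return [
        fingerprint
        for fingerprint, container_id, _ in next_backup_chunks
        if container_id in sparse_from_previous
    ]
```

In this sketch, the set returned for one backup would be stored and consulted while the next backup is deduplicated, which is the sense in which the decision is history-aware: the rewrite choice relies on what was observed in the preceding backup rather than on a guess made in the middle of the current stream.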
