Journal: IEEE Transactions on Parallel and Distributed Systems

Improving Restore Performance of Packed Datasets in Deduplication Systems via Reducing Persistent Fragmented Chunks

Abstract

Data deduplication eliminates redundancy in storage systems efficiently, but it introduces chunk fragmentation, which severely degrades restore performance. Rewriting algorithms have been proposed to reduce this fragmentation. Backup software typically aggregates files into larger "tar"-type files for storage. We observe that, in tar-type datasets, state-of-the-art rewriting algorithms repeatedly rewrite a large number of Persistent Fragmented Chunks (PFCs) in every backup, which severely impacts restore performance. We find that PFCs exist because the traditional strategy stores them alongside other chunks in containers to preserve stream locality, so they always end up in containers with low utilization. We propose DePFC to reduce PFCs. DePFC identifies PFCs, removes them from the containers that preserve stream locality, and groups them together. This increases the utilization of the containers holding them in the subsequent backup and thus prevents them from being rewritten again. We further propose an FC Buffer that avoids both mistaken rewrites of PFCs and the grouping together of PFCs that would cause restore cache thrashing. Experimental results demonstrate that DePFC improves the restore performance of state-of-the-art rewriting algorithms by 44.24 to 89.42 percent while attaining comparable deduplication efficiency, and that the FC Buffer further improves restore performance.
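
The idea of low container utilization and of grouping persistently fragmented chunks can be illustrated with a toy model. The sketch below is not the DePFC implementation: the utilization-threshold rule, the constants, the chunk and container identifiers, and the helper names (`container_utilization`, `fragmented_chunks`, `group_chunks`) are all assumptions made for illustration. It flags duplicate chunks that sit in poorly utilized containers, finds the ones flagged in two consecutive backups, and shows that placing them together raises the utilization of their container.

```python
from collections import defaultdict

CONTAINER_CAPACITY = 4        # chunks per container (toy value, assumed)
UTILIZATION_THRESHOLD = 0.5   # rewrite threshold (assumed, not from the paper)


def container_utilization(backup, location):
    """Map container id -> fraction of its capacity referenced by this backup."""
    referenced = defaultdict(set)
    for chunk in backup:
        if chunk in location:              # only already-stored (duplicate) chunks
            referenced[location[chunk]].add(chunk)
    return {cid: len(chunks) / CONTAINER_CAPACITY
            for cid, chunks in referenced.items()}


def fragmented_chunks(backup, location):
    """Duplicate chunks stored in containers whose utilization is below the threshold."""
    util = container_utilization(backup, location)
    return {c for c in backup
            if c in location and util[location[c]] < UTILIZATION_THRESHOLD}


def group_chunks(chunks, location, new_container_id):
    """Return a new layout with the given chunks moved into one container.

    Assumes the chunks fit into a single container.
    """
    regrouped = dict(location)
    for chunk in chunks:
        regrouped[chunk] = new_container_id
    return regrouped


backup = ["a", "b", "c", "e", "i"]

# Layout seen by backup 1: "e" and "i" each sit alone in their containers.
layout_1 = {"a": 0, "b": 0, "c": 0, "e": 1, "i": 2}
frag_1 = fragmented_chunks(backup, layout_1)

# Assumed outcome of a locality-preserving rewrite: "e" and "i" land in
# separate new containers next to chunks the next backup does not reference.
layout_2 = {"a": 0, "b": 0, "c": 0, "e": 3, "i": 4}
frag_2 = fragmented_chunks(backup, layout_2)

# Chunks flagged in both backups are the persistent ones in this toy model.
persistent = frag_1 & frag_2

# Grouping them into one container lifts that container to the threshold.
layout_3 = group_chunks(persistent, layout_2, new_container_id=5)
frag_3 = fragmented_chunks(backup, layout_3)
```

In this toy run, `e` and `i` are flagged in both backups, and after grouping neither is flagged because their shared container reaches the assumed threshold. A real system would measure utilization in bytes and would have to decide how to handle restore-cache behavior, which the abstract assigns to the FC Buffer.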