Reducing Fragmentation for In-line Deduplication Backup Storage via Exploiting Backup History and Cache Knowledge

Fu Min; Feng Dan; Hua Yu; He Xubin; Chen Zuoning; Liu Jingning; Xia Wen; Huang Fangting; Liu Qing


Abstract

In backup systems, the chunks of each backup are physically scattered after deduplication, which causes a challenging fragmentation problem. We observe that the fragmentation takes the form of sparse and out-of-order containers. Sparse containers decrease restore performance and garbage collection efficiency, while out-of-order containers decrease restore performance if the restore cache is small. To reduce the fragmentation, we propose the History-Aware Rewriting algorithm (HAR) and the Cache-Aware Filter (CAF). HAR exploits historical information in backup systems to accurately identify and reduce sparse containers, and CAF exploits restore cache knowledge to identify the out-of-order containers that hurt restore performance. CAF efficiently complements HAR in datasets where out-of-order containers are dominant. To reduce the metadata overhead of garbage collection, we further propose a Container-Marker Algorithm (CMA) that identifies valid containers instead of valid chunks. Our extensive experimental results from real-world datasets show that HAR improves restore performance by a factor of 2.84-175.36 at a cost of rewriting only 0.5-2.03 percent of the data.
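
The abstract gives no detail on how HAR identifies sparse containers. The following is a minimal sketch of the general idea. It assumes that a container is "sparse" when the fraction of its bytes referenced by a backup falls below a threshold, and that the sparse containers recorded for one backup are used to pick duplicate chunks to rewrite in the next. The names `find_sparse_containers` and `should_rewrite`, the 4 MiB container size, and the 0.5 threshold are all hypothetical and do not come from the abstract.

```python
from collections import defaultdict

CONTAINER_SIZE = 4 * 1024 * 1024  # assumed fixed container size in bytes
SPARSE_THRESHOLD = 0.5            # assumed utilization cutoff (hypothetical)


def find_sparse_containers(recipe, container_size=CONTAINER_SIZE,
                           threshold=SPARSE_THRESHOLD):
    """Return IDs of containers that this backup uses only sparsely.

    recipe: iterable of (fingerprint, container_id, chunk_size) tuples,
    one per chunk reference in the backup.
    """
    referenced = defaultdict(int)  # container_id -> bytes referenced
    seen = set()                   # count each stored chunk only once
    for fingerprint, container_id, size in recipe:
        key = (fingerprint, container_id)
        if key in seen:
            continue
        seen.add(key)
        referenced[container_id] += size
    return {cid for cid, used in referenced.items()
            if used / container_size < threshold}


def should_rewrite(container_id, is_duplicate, sparse_from_last_backup):
    """Rewrite a duplicate chunk if it lives in a container that the
    previous backup was recorded as using sparsely (history-aware choice)."""
    return is_duplicate and container_id in sparse_from_last_backup
```

In this sketch the sparse set is computed after a backup finishes and consulted during the next one, so no utilization needs to be estimated while chunks are still arriving.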


Bibliographic details

Source
IEEE Transactions on Parallel and Distributed Systems, 2016, Issue 3, pp. 855-868 (14 pages)
Authors
Fu Min; Feng Dan; Hua Yu; He Xubin; Chen Zuoning; Liu Jingning; Xia Wen; Huang Fangting; Liu Qing
Affiliation

Wuhan National Laboratory for Optoelectronics, School of Computer Science and Technology, Huazhong University of Science and Technology, Division of Data Storage System, Wuhan, China

Original format: PDF
Language: English
Keywords
Data deduplication; chunk fragmentation; performance evaluation; storage system;


