ACM/IEEE Design Automation Conference

Tier-Scrubbing: An Adaptive and Tiered Disk Scrubbing Scheme with Improved MTTD and Reduced Cost

Abstract

Sector errors are a common type of error in modern disks. A sector error that occurs during an I/O operation can make an application inaccessible. Even worse, it can cause permanent data loss if it occurs while data is being reconstructed, which severely affects the reliability of a storage system. Many disk scrubbing schemes have been proposed to solve this problem, but existing approaches have several limitations. First, they use machine learning (ML) to predict latent sector errors (LSEs) but rely on a single snapshot of training data for each prediction, and thereby ignore sequential dependencies between the statuses of a hard disk over time. Second, they accelerate scrubbing at a fixed rate based on the results of a binary classification model, which may increase scrubbing cost unnecessarily. Third, they naively accelerate scrubbing of the entire disk that is predicted to have LSEs, but neglect partial high-risk areas (areas with a higher probability of encountering LSEs). Lastly, they do not employ strategies to scrub these high-risk areas in advance based on I/O access patterns, which would further increase scrubbing efficiency.

We address these challenges by designing a Tier-Scrubbing (TS) scheme that combines a Long Short-Term Memory (LSTM) based Adaptive Scrubbing Rate Controller (ASRC), a module that exploits sector error locality to locate high-risk areas in a disk, and a piggyback scrubbing strategy to improve the reliability of a storage system. Our evaluation on realistic datasets and workloads from two real-world data centers demonstrates that TS can simultaneously decrease the Mean-Time-To-Detection (MTTD) by about 80% and the scrubbing cost by 20%, compared to a state-of-the-art scrubbing scheme.
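The contrast the abstract draws between fixed-rate acceleration and an adaptive, tiered approach can be illustrated with a small sketch. This is not the paper's ASRC or its locality module: the function names, the linear risk-to-rate mapping, the `boost` factor, and the region labels are all hypothetical, and the risk scores are assumed to come from some external predictor (the abstract names an LSTM, which is not modeled here).

```python
def fixed_rate(risk, base=1.0, boost=4.0, threshold=0.5):
    """Binary scheme (assumed form): every disk flagged as risky
    gets the same accelerated rate, whatever its actual risk."""
    return base * boost if risk >= threshold else base


def adaptive_rate(risk, base=1.0, boost=4.0):
    """Adaptive scheme (hypothetical linear mapping): the scrubbing
    rate grows with the predicted risk, clamped to [0, 1]."""
    risk = min(max(risk, 0.0), 1.0)
    return base * (1.0 + (boost - 1.0) * risk)


def scrub_order(regions):
    """Tiered ordering (assumption): scrub the regions with the
    highest predicted risk first instead of the whole disk uniformly."""
    return sorted(regions, key=regions.get, reverse=True)


if __name__ == "__main__":
    # A disk just above the threshold is scrubbed at full boost by the
    # binary scheme, but at a lower rate by the adaptive one.
    print(fixed_rate(0.6), adaptive_rate(0.6))
    print(scrub_order({"outer": 0.1, "middle": 0.9, "inner": 0.5}))
```

Under these assumptions, a disk whose predicted risk barely crosses the threshold costs less to scrub with the adaptive mapping than with the fixed boost, which is the kind of unnecessary cost the abstract attributes to fixed-rate schemes.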