Source: IEEE International Conference on Data Mining

Layerwise Perturbation-Based Adversarial Training for Hard Drive Health Degree Prediction

Abstract

With the development of cloud computing and big data, the reliability of data storage systems has become increasingly important. Previous research has shown that machine learning algorithms based on SMART attributes are effective for predicting hard drive failures. In this paper, we use SMART attributes to predict hard drive health degrees, which help in taking different fault-tolerant actions in advance. Because SMART datasets are highly imbalanced, predicting the health degree precisely is nontrivial: a model trained by traditional methods would suffer from overfitting and biased fitting. To resolve these problems, we propose two strategies that make better use of imbalanced data and improve performance. First, we design a layerwise perturbation-based adversarial training method that can add perturbations to any layer of a neural network to improve the network's generalization. Second, we extend the training method to the semi-supervised setting, which makes it possible to use unlabeled data that have a potential for failure to further improve the model's performance. Our extensive experiments on two real-world hard drive datasets demonstrate the superiority of the proposed schemes for both supervised and semi-supervised classification. The model trained by the proposed method can correctly predict the hard drive health status 5 and 15 days in advance.
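The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea of perturbing an intermediate layer rather than the input. It assumes a tiny two-layer network (tanh hidden layer, softmax output) and a gradient-direction perturbation of fixed L2 norm `eps` applied to the hidden activation. The function name `hidden_layer_adversarial_loss`, the weights, and the choice of perturbation are all illustrative assumptions and are not taken from the paper.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def cross_entropy(p, label):
    return -math.log(p[label])

def hidden_layer_adversarial_loss(W1, W2, x, label, eps):
    """Return (clean_loss, adversarial_loss) for one example.

    Assumption: the perturbation is added to the hidden activation h
    (not to the input x) along the loss gradient, scaled to L2 norm eps.
    """
    h = [math.tanh(v) for v in matvec(W1, x)]
    p = softmax(matvec(W2, h))
    clean = cross_entropy(p, label)

    # Gradient of cross-entropy w.r.t. the logits is (p - one_hot(label));
    # back through the linear layer W2 it becomes W2^T (p - one_hot).
    d_logits = [pi - (1.0 if i == label else 0.0) for i, pi in enumerate(p)]
    g = [sum(W2[i][j] * d_logits[i] for i in range(len(W2)))
         for j in range(len(h))]

    # Normalize the gradient; fall back to 1.0 to avoid dividing by zero.
    norm = math.sqrt(sum(v * v for v in g)) or 1.0
    r = [eps * v / norm for v in g]

    h_adv = [a + b for a, b in zip(h, r)]
    adv = cross_entropy(softmax(matvec(W2, h_adv)), label)
    return clean, adv

# Hypothetical toy weights and input, chosen only for illustration.
W1 = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]
W2 = [[0.4, -0.2, 0.1], [-0.3, 0.5, 0.7]]
x = [1.0, 2.0]

clean, adv = hidden_layer_adversarial_loss(W1, W2, x, label=0, eps=0.1)
print(clean, adv)
```

In a training loop under these assumptions, the two losses would typically be combined (for example, as a weighted sum) and minimized together, so the network learns to stay accurate when its hidden representation is nudged in the worst-case direction. The same routine could in principle be applied at any layer by choosing which activation receives the perturbation.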
