Journal of Intelligent Manufacturing

Identifying maximum imbalance in datasets for fault diagnosis of gearboxes


Abstract

Research into fault diagnosis in rotating machinery with a wide range of variable loads and speeds, such as the gearboxes of wind turbines, is of great industrial interest. Although appropriate sensors have been identified, an intelligent system that classifies machine states remains an open issue, due to a paucity of datasets with sufficient fault cases. Many of the proposed solutions have been tested on balanced datasets, containing roughly equal percentages of wind-turbine failure instances and instances of correct performance. In practice, however, it is not possible to obtain balanced datasets under real operating conditions. Our objective is to identify the classification technique that depends least on the level of imbalance in the dataset. We start by analysing different metrics for the comparison of classification techniques on imbalanced datasets. Our results point to the Unweighted Macro Average of the F-measure, which we consider the most suitable metric for this diagnosis. Then, an extensive set of classification techniques was tested on datasets with varying levels of imbalance. Our conclusion is that a Rotation Forest ensemble of C4.4 decision trees, with the training phase of the classifier modified by a cost-sensitive approach, is the most suitable prediction model for this industrial task. It maintained its good performance even when the rate of minority classes was as low as 6.5%, while most of the other classifiers were more sensitive to the level of dataset imbalance and failed standard performance objectives when the rate of minority classes was lower than 10.5%.
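
As a rough illustration of why the abstract favours the Unweighted Macro Average of the F-measure on imbalanced data, the sketch below compares it with plain accuracy. It assumes the standard F1 definition (harmonic mean of precision and recall) for each class; the function names, the class labels, and the toy data are hypothetical and are not taken from the paper.

```python
def macro_f_measure(y_true, y_pred):
    """Unweighted macro average of per-class F-measures (F1).

    Every class contributes equally to the mean, regardless of how
    many instances it has. A class with no true positives, false
    positives, or false negatives is scored 0.0 (an assumed convention).
    """
    labels = sorted(set(y_true) | set(y_pred))
    f_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic
        # mean of precision and recall.
        f_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f_scores) / len(f_scores)


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy dataset: 93 healthy instances and 7 fault instances.
y_true = ["ok"] * 93 + ["fault"] * 7
# A degenerate classifier that never predicts a fault.
y_pred = ["ok"] * 100

acc = accuracy(y_true, y_pred)
macro_f = macro_f_measure(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro F-measure = {macro_f:.3f}")
```

In this toy case the classifier misses every fault yet still reaches high accuracy, while the macro-averaged F-measure drops sharply because the fault class scores zero and counts as much as the healthy class.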
