Journal: Advanced Engineering Informatics

Ensemble data mining modeling in corrosion of concrete sewer: A comparative study of network-based (MLPNN & RBFNN) and tree-based (RF, CHAID, & CART) models

Abstract

This research evaluates the potential of ensemble learning (bagging, boosting, and modified bagging) for predicting microbially induced concrete corrosion in sewer systems from a data mining (DM) perspective. Particular focus is placed on ensemble techniques for network-based DM methods, including the multi-layer perceptron neural network (MLPNN) and the radial basis function neural network (RBFNN), as well as tree-based DM methods, such as the chi-square automatic interaction detector (CHAID), classification and regression tree (CART), and random forests (RF). An interdisciplinary approach is thus presented that combines findings from material sciences and hydrochemistry with data mining analyses to predict concrete corrosion. Factors that affect concrete corrosion, such as time, gas temperature, gas-phase H₂S concentration, relative humidity, pH, and exposure phase, are used as the model inputs. All 433 datasets are randomly selected to construct an individual model and twenty component models of boosting, bagging, and modified bagging, based on training, validation, and testing, for each DM base learner. The best ensemble predictive models are selected using several model performance indices (e.g., root mean square error, RMSE; mean absolute percentage error, MAPE; correlation coefficient, r). The results indicate that the prediction ability of the random forests DM model is superior to that of the other ensemble learners, followed by the ensemble Bag-CHAID method. On average, the ensemble tree-based models performed better than the ensemble network-based models; nevertheless, ensemble learning was also found to enhance the general performance of individual DM models by more than 10%.
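To make the evaluation setup in the abstract more concrete, the following is a minimal Python sketch of bagging with twenty component models, scored with RMSE, MAPE, and the correlation coefficient r. It is an illustration under stated assumptions, not the study's implementation: the base learner is a hypothetical one-split regression stump standing in for a tree-based learner, the data are synthetic, and all function names are invented for this example.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero targets)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observations and predictions."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def fit_stump(xs, ys):
    """Hypothetical base learner: a one-split regression tree on one feature."""
    best = None
    # Candidate thresholds exclude the largest x so the right side is never empty.
    for thr in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, ml, mr)
    if best is None:  # every x identical: fall back to the mean
        mean_y = sum(ys) / len(ys)
        return lambda x: mean_y
    _, thr, ml, mr = best
    return lambda x: ml if x <= thr else mr


def fit_bagging(xs, ys, n_models=20, seed=0):
    """Bagging: train each component model on a bootstrap resample, then average."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(m(x) for m in models) / len(models)


# Synthetic toy data (NOT the study's data): one input, a noisy increasing target.
_rng = random.Random(42)
xs = [float(i) for i in range(1, 41)]
ys = [1.0 + 0.5 * x + _rng.uniform(-1.0, 1.0) for x in xs]

single = fit_stump(xs, ys)
ensemble = fit_bagging(xs, ys, n_models=20, seed=0)

for name, model in (("single stump", single), ("bagged stumps", ensemble)):
    preds = [model(x) for x in xs]
    print(name, rmse(ys, preds), mape(ys, preds), pearson_r(ys, preds))
```

The sketch scores both models on the training data only to keep it short; a study of this kind would report the indices on held-out validation and test partitions. Boosting and modified bagging differ in how the component models are trained and combined and are not shown here.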
