Journal: Stochastic Environmental Research and Risk Assessment

Implementation of free and open-source semi-automatic feature engineering tool in landslide susceptibility mapping using the machine-learning algorithms RF, SVM, and XGBoost

Abstract

Various machine learning (ML) techniques have been recommended and used in the literature to produce landslide susceptibility maps (LSMs). Feature engineering (FE) is an important topic in ML studies, yet most research overlooks it. In this study, a novel FE framework, comprising feature selection, feature transformation, feature binning, and feature weighting, is proposed to produce LSMs using eXtreme gradient boosting (XGBoost), random forest (RF), and support vector machine (SVM). First, thirteen landslide conditioning factors prepared during data preprocessing were used to build LSM models for the study area, the Babadag district of Denizli Province in the Aegean region of Turkey. Second, two irrelevant factors were eliminated from the input feature subset by the feature selection step of the FE framework. Third, features identified as skewed were converted into a symmetric form by applying a log transformation. Then, the remaining continuous-valued factors were converted into categorical values using the quantile classifier technique. During the feature weighting phase, four feature weighting methods, namely XGBoost, RF, non-negative least squares (NNLS), and frequency ratio (FR), were used to calculate a weight for each subclass of each landslide-related factor. In addition, the proposed feature subsets were compared with the raw data. At the end of the process, the XGBoost model constructed with an FR-selected subset (overall accuracy (Acc) = 0.907 and area under the curve (AUC) = 0.9822) outperformed both the model built on raw data (Acc = 0.874; AUC = 0.960) and the other combinations (i.e., RF–FR and SVM–NNLS). Consequently, the results indicate that the proposed FE approach could be a useful framework for increasing the performance of ML techniques by identifying and extracting relevant features to develop highly optimized and enriched models.
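The transformation, binning, and weighting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' tool: the function names, the toy values, and the choice of four bins are assumptions made for illustration, and only the Python standard library is used. The frequency ratio of a class is computed here in its common form, the share of landslide cells falling in the class divided by the share of all cells falling in the class.

```python
import math
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def log_transform(values):
    """Reduce right skew with log(1 + x); assumes non-negative inputs."""
    return [math.log1p(v) for v in values]


def quantile_bins(values, n_bins=4):
    """Assign each value to one of n_bins classes using quantile cut points."""
    cuts = quantiles(values, n=n_bins)  # returns n_bins - 1 cut points
    return [bisect_right(cuts, v) for v in values]


def frequency_ratio(classes, labels):
    """FR per class: (landslides in class / all landslides) / (cells in class / all cells)."""
    total = len(classes)
    total_slides = sum(labels)
    cells = Counter(classes)
    slides = Counter(c for c, y in zip(classes, labels) if y == 1)
    return {
        c: (slides[c] / total_slides) / (cells[c] / total)
        for c in sorted(cells)
    }


# Hypothetical toy data: one skewed conditioning factor and landslide labels
# (1 = landslide cell, 0 = non-landslide cell). Not taken from the study.
factor_values = [5, 10, 20, 40, 80, 160, 320, 640]
landslide = [1, 1, 1, 0, 1, 0, 0, 0]

classes = quantile_bins(log_transform(factor_values), n_bins=4)
weights = frequency_ratio(classes, landslide)

# Replace each class index with its FR weight to obtain the weighted feature.
weighted_feature = [weights[c] for c in classes]
```

In this sketch a class with FR above 1 holds more landslide cells than its share of the area would suggest, and a class with FR below 1 holds fewer. The weighted feature could then be passed to a classifier such as XGBoost, RF, or SVM in place of the raw values.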
