Journal: Bulletin of Engineering Geology and the Environment

Application of classification coupled with PCA and SMOTE, for obtaining safety factor of landslide based on HRA

Abstract

Machine learning algorithms have been recently applied to build a landslide susceptibility map. The objective of this study is to find whether classification algorithms of machine learning are suitable for obtaining safety factor based on a high-risk-area (HRA) model, composed of eight geotechnical properties. Each property value is designated as an input value for machine learning, and the output value is determined as a safety factor. The data are transformed into continuous data after preprocessing with label encoding since the data have a discontinuous pattern. The DT, KNN, LR, RF, and SVM algorithms are selected to perform the classification with train and validation ratio of 7:3. To improve the reliability of the results, the classification is also performed after applying the PCA technique, which can reduce eight dimensions to two principal components. In addition, the number of data is equally oversampled using the SMOTE technique to solve the data imbalance problem for each class, and the results of classification are also compared. The PCA shows a limited ability to reflect the characteristics of the original data, and the oversampled data by the SMOTE provides high reliability. The results show that the RF is suitable for performing classification with high accuracy in the range of 1.2-2.4 of safety factors. This study demonstrates that it is possible to classify even discontinuous data through a preprocessing technique, and SMOTE can improve the accuracy of landslide risk mapping.
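
The abstract does not show how the oversampling was implemented, so the following is only a minimal sketch of the SMOTE idea it refers to: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest neighbours from the same class, and every class is grown to the size of the largest one. The function names `smote` and `oversample_equally`, the parameter defaults, and the two-feature toy data are assumptions made for illustration; they are not taken from the paper, which uses eight geotechnical properties and most likely relied on an existing library.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from the feature vectors of one class."""
    rng = random.Random(seed)
    if len(minority) < 2:
        raise ValueError("need at least two samples of a class to interpolate")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other samples of the same class by Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def oversample_equally(X, y, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(list(features))
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = []
        if len(rows) < target:
            extra = smote(rows, target - len(rows), k=k, seed=seed)
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


# Toy data (hypothetical): four "low" samples and two "high" samples.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
y = ["low", "low", "low", "low", "high", "high"]
X_bal, y_bal = oversample_equally(X, y)
print({label: y_bal.count(label) for label in set(y_bal)})
```

After the call, both classes contain four samples, and the two synthetic "high" points lie on the segment between the two original "high" points. In a workflow like the one described, oversampling would normally be applied to the training portion of the 7:3 split only, so that the validation data stay free of synthetic samples; whether the paper did this is not stated in the abstract.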
