Published in Jurnal RESTI: Rekayasa Sistem dan Teknologi Informasi

Analysis of the Effect of Data Scaling on the Performance of the Machine Learning Algorithm for Plant Identification


Abstract

Data scaling is an important preprocessing step that affects the performance of machine learning algorithms. This study analyzes the effect of min-max normalization and standardization (zero-mean normalization) on the performance of machine learning algorithms. The study first normalized a dataset of leaf venation features. The normalized datasets were then tested with four machine learning algorithms: KNN, Naïve Bayes, ANN, and SVM with RBF and linear kernels. The analysis was based on model evaluation using 10-fold cross-validation and on validation using test data. The results show that Naïve Bayes has the most stable performance under both min-max normalization and standardization. The KNN algorithm is fairly stable compared with SVM and ANN. However, combining min-max normalization with an SVM that uses the RBF kernel yields the best performance. SVM with a linear kernel, by contrast, performs best when standardization (zero-mean normalization) is applied. For the ANN algorithm, a number of trials are needed to find out which normalization technique suits it best.
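
The two scaling techniques compared in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the toy feature values are hypothetical, the population standard deviation is assumed for standardization, and the practice of learning the scaling parameters on the training split and reusing them on the test split is a common convention that the abstract does not confirm.

```python
from statistics import mean, pstdev


def fit_min_max(column):
    """Learn the minimum and maximum of one feature from training data."""
    return min(column), max(column)


def apply_min_max(column, lo, hi):
    """Min-max normalization: map the training range [lo, hi] onto [0, 1]."""
    span = hi - lo
    if span == 0:  # constant feature: avoid division by zero
        return [0.0 for _ in column]
    return [(x - lo) / span for x in column]


def fit_standard(column):
    """Learn mean and (population) standard deviation from training data."""
    return mean(column), pstdev(column)


def apply_standard(column, mu, sigma):
    """Standardization: shift to zero mean and scale to unit variance."""
    if sigma == 0:  # constant feature: avoid division by zero
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


# Hypothetical values of a single feature (not taken from the study).
train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 10.0]

lo, hi = fit_min_max(train)
train_mm = apply_min_max(train, lo, hi)
test_mm = apply_min_max(test, lo, hi)  # may fall outside [0, 1]

mu, sigma = fit_standard(train)
train_z = apply_standard(train, mu, sigma)
test_z = apply_standard(test, mu, sigma)

print("min-max:", train_mm, test_mm)
print("z-score:", train_z, test_z)
```

Min-max normalization bounds the training values to [0, 1], while standardization leaves them unbounded but centered on zero. Distance-based and kernel-based methods such as KNN and SVM are sensitive to this difference, which is in line with the abstract's finding that the best choice depends on the algorithm.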
