Comparison of the performance of multiclass classifiers in chemical data: Addressing the problem of overfitting with the permutation test


Abstract

The objective of this work was to apply different pattern recognition techniques to datasets commonly used as chemometric case studies, namely the Glass Identification Dataset and the Wine Quality Dataset. Three types of classification models were used. The first type comprised discriminant analysis and other linear classification models: Linear Discriminant Analysis (LDA), Regularized Discriminant Analysis (RDA), Mixture Discriminant Analysis (MDA), and Partial Least Squares Discriminant Analysis (PLS-DA). The second type comprised nonlinear classification models: Artificial Neural Networks (ANN), Support Vector Machine (SVM) with a radial kernel function, k-Nearest Neighbors (k-NN), Naive Bayes (NB), and Learning Vector Quantization (LVQ). The last type comprised classification trees and rule-based models: Classification and Regression Tree (CART), Bagging, Random Forest (RF), C5.0, and Generalized Boosted Machine (GBM). The results obtained outperformed the classification results previously published in the literature. The computational experiments show that LVQ was the only method able to classify all three datasets correctly. Permutation tests were applied to check for overfitting. The results showed no overfitting, which was confirmed by the pairwise Wilcoxon signed-rank test.
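The abstract does not say how the permutation test was implemented, so the following is only a minimal sketch of the general idea: score a classifier under cross-validation, then repeat the scoring many times with shuffled class labels and compare. The nearest-centroid classifier, the synthetic three-class data, and every function name here are illustrative assumptions, not the models, datasets, or code used in the paper.

```python
import random

def fit_centroids(X, y):
    """Return {label: mean feature vector} for the training rows."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(centroids, row):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda lab: sqdist(centroids[lab]))

def cv_accuracy(X, y, k=5):
    """Accuracy under k-fold cross-validation with interleaved folds."""
    n = len(X)
    correct = 0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        centroids = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(centroids, X[i]) == y[i] for i in test)
    return correct / n

def permutation_test(X, y, n_permutations=200, seed=0):
    """Compare the real CV accuracy with accuracies on label-shuffled data."""
    rng = random.Random(seed)
    observed = cv_accuracy(X, y)
    null_scores = []
    for _ in range(n_permutations):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break any link between features and labels
        null_scores.append(cv_accuracy(X, shuffled))
    # Add-one correction so the estimated p-value is never exactly zero.
    exceed = sum(score >= observed for score in null_scores)
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, null_scores, p_value

def make_toy_data(n_per_class=30, seed=1):
    """Hypothetical stand-in data: three Gaussian blobs in two dimensions."""
    rng = random.Random(seed)
    centers = {0: (0.0, 0.0), 1: (4.0, 0.0), 2: (0.0, 4.0)}
    rows = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            rows.append(([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)], label))
    rng.shuffle(rows)  # mix the classes across folds
    return [r[0] for r in rows], [r[1] for r in rows]

X, y = make_toy_data()
observed, null_scores, p_value = permutation_test(X, y)
print("observed CV accuracy:", round(observed, 3))
print("mean accuracy on permuted labels:", round(sum(null_scores) / len(null_scores), 3))
print("permutation p-value:", round(p_value, 4))
```

If the accuracy on the real labels is far above what the same procedure achieves on shuffled labels, the small p-value suggests the classifier is picking up genuine class structure rather than fitting noise. An accuracy that is just as high on shuffled labels would be a warning sign of overfitting or of leakage in the evaluation.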


Bibliographic details

Source: Chemometrics and Intelligent Laboratory Systems, 2020, No. 2020, 7 pages
Format: PDF
Language: English
Classification (CLC): Metrology
Keywords: Pattern recognition; Glass; Wine; Overfitting; Permutation test

