An automatic extraction method of the domains of competence for learning classifiers using data complexity measures

Luengo Julian; Herrera Francisco

Abstract

New algorithms and problems appear constantly in data mining, which makes it impossible to know in advance whether a model will perform well or poorly until it is applied, and applying it can be costly. It would be useful to have a procedure that indicates, before the learning algorithm is applied and without requiring a comparison with other methods, whether the outcome will be good or bad, using only the information available in the data. In this work, we present an automatic extraction method that determines the domains of competence of a classifier using a set of data complexity measures proposed for the classification task. These domains codify the characteristics of the problems for which the classifier is or is not suitable, relating the geometrical structures of the data that may be difficult to the final accuracy obtained by the classifier. To do so, the proposal applies 12 data complexity metrics to a large benchmark of datasets in order to analyze the behavior patterns of the method, obtaining intervals of the data complexity measures in which performance is good or bad. Three classical but different algorithms are used as representative classifiers to analyze the proposal: C4.5, SVM and K-NN. From these intervals, two simple rules describing good or bad behavior are obtained for each of the classifiers mentioned, allowing the user to characterize the response quality of the methods from a dataset's complexity. These two rules have been validated on fresh problems, showing that they are general and accurate. Thus, it can be established whether the classifier will perform well or poorly prior to its application.
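
The abstract does not spell out how the intervals are extracted, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes one complexity measure, one classifier, and a fixed accuracy threshold; the names `good_intervals`, `rule_fires`, `min_size` and the sample values in `benchmark` are all hypothetical and made up for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def good_intervals(
    points: List[Tuple[float, float]],
    threshold: float,
    min_size: int = 3,
) -> List[Interval]:
    """Find ranges of a complexity measure where accuracy stays high.

    points: one (measure_value, test_accuracy) pair per benchmark dataset.
    threshold: accuracy at or above which a dataset counts as "good"
        (an assumption; the abstract gives no concrete criterion).
    min_size: hypothetical minimum number of consecutive good datasets
        required before a run is accepted as an interval.
    """
    ordered = sorted(points)  # order the datasets by the measure value
    intervals: List[Interval] = []
    run: List[float] = []  # measure values of the current good run
    for value, accuracy in ordered:
        if accuracy >= threshold:
            run.append(value)
            continue
        # A poorly handled dataset ends the current run.
        if len(run) >= min_size:
            intervals.append((run[0], run[-1]))
        run = []
    if len(run) >= min_size:  # close a run that reaches the last dataset
        intervals.append((run[0], run[-1]))
    return intervals


def rule_fires(value: float, intervals: List[Interval]) -> bool:
    """A simple rule: true if the measure value lies in any interval."""
    return any(low <= value <= high for low, high in intervals)


# Made-up (measure_value, test_accuracy) pairs, for illustration only.
benchmark = [
    (0.10, 0.95), (0.15, 0.93), (0.22, 0.91), (0.40, 0.70),
    (0.55, 0.92), (0.60, 0.65), (0.80, 0.60),
]
intervals = good_intervals(benchmark, threshold=0.90)
```

With this sample data, `rule_fires(0.12, intervals)` would flag a new dataset as falling in a region of good behavior before the classifier is ever trained on it. Intervals of bad behavior could be obtained the same way by inverting the comparison.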

Bibliographic details

Source
Knowledge and Information Systems | 2015, Issue 1 | 34 pages
Authors
Luengo Julian; Herrera Francisco
Original format: PDF
Language: English
Chinese Library Classification: automatic information theory
Keywords
Classification; Data complexity; Domains of competence; C4.5; Support vector machines; K-nearest neighbor

Date indexed: 2022-08-18 12:07:05

Similar literature

1. An automatic extraction method of the domains of competence for learning classifiers using data complexity measures [J]. Luengo Julian, Herrera Francisco. Knowledge and Information Systems, 2015, Issue 1.
2. Automatic matrix-based analysis method for extraction of optical fiber parameters from polarimetric optical time domain reflectometry data [J]. Ellison J.G., Siddiqui A.S. Journal of Lightwave Technology, 2000, Issue 9.
3. Research on the Automatic Extraction Method of Web Data Objects Based on Deep Learning [J]. Peng Hao, Li Qiao. Intelligent Automation and Soft Computing, 2020, Issue 3.
4. Fuzzy Inference Methods Applied to the Learning Competence Measure in Dynamic Classifier Selection [C]. Kurzynski Marek, Krysmann Maciej. SIBGRAPI Conference on Graphics, Patterns and Images, 2014.
5. An automatic method for classifying medical researchers into domain specific subgroups [D]. Cecchetti, Alfred A. 2009.
6. A heuristic method for simulating open-data of arbitrary complexity that can be used to compare and evaluate machine learning methods [O]. Jason H. Moore, Maksim Shestov, Peter Schmitt.
7. An Extraction Method for the Characterization of the Fuzzy Rule Based Classification Systems' Behavior using Data Complexity Measures: A case of study with FH-GBML [O]. Julián Luengo, Francisco Herrera. 2014.
8. KI-LEARN: Knowledge-Intensive Learning Methods for Knowledge-Rich/Data-Poor Domains [R]. Dietterich, T. G., Restificar, A., Tadepalli, P. 2006.
