Data Mining Exploration System for Feature Selection Tasks



Abstract

Data mining and knowledge discovery are inherently tied to databases: data mining methods are used in the knowledge discovery process to reveal new knowledge in large databases. One stage of that process is feature selection, usually understood as finding a subset of the original features that describe the patterns in a given data set, where the subset is optimal with respect to a defined goal and selection criterion. The aim of this paper is to present the main functionality of the Data Mining Exploration System (DMES) [4] and to explain its usefulness for feature selection tasks. DMES is an integrated software system that incorporates many algorithms usable in data mining. It is currently being developed at the University of Rzeszow. We briefly describe its most recent version (1.2) and the algorithms that have already been implemented. DMES lets users visualize, split, preprocess, analyze, classify and reduce decision tables. To show one of the possibilities DMES provides, we used it for a feature selection task [3], [7], [8], [10], [12]. Feature selection is performed by the most recent additions to DMES: the RBFS1, RBFS2 and ARS algorithms [14]. These feature selection methods are designed mainly for multiple-classifier systems with homogeneous classifiers [2], [5], [6]. Homogeneous classifiers require many different subsets of the data set, and the problem of finding the best subsets of a given feature set is exponentially complex. We use the RBFS (Reduct Based Feature Selection) algorithm [15] and its two modifications, RBFS1 and RBFS2, to select optimal subsets of the feature set for multiple classifiers. The RBFS algorithms are computationally expensive because they use all decision-relative reducts [11] of a given decision table.
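To make the notion of a decision-relative reduct concrete, here is a minimal sketch that enumerates all of them for a tiny decision table by brute force. It is not DMES or RBFS code. It assumes the common rough-set definition, under which a reduct is a minimal subset of condition attributes that preserves the positive region of the decision. The function names and the toy table are hypothetical and chosen only for illustration. The search visits every attribute subset, which also illustrates why the cost grows exponentially with the number of features.

```python
from itertools import combinations


def positive_region(rows, attrs, decision):
    """Return indices of rows whose values on `attrs` determine the decision uniquely."""
    groups = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        groups.setdefault(key, []).append(i)
    pos = set()
    for members in groups.values():
        # A group is consistent when all its rows share one decision value.
        if len({rows[i][decision] for i in members}) == 1:
            pos.update(members)
    return pos


def decision_relative_reducts(rows, conditions, decision):
    """Brute-force search for all minimal attribute subsets that keep the positive region."""
    full = positive_region(rows, conditions, decision)
    reducts = []
    # Subsets are visited in order of increasing size, so any subset that
    # contains an already-found reduct cannot be minimal and is skipped.
    for size in range(len(conditions) + 1):
        for subset in combinations(conditions, size):
            if any(set(r) <= set(subset) for r in reducts):
                continue
            if positive_region(rows, subset, decision) == full:
                reducts.append(subset)
    return reducts


# Hypothetical toy decision table: condition attributes a, b, c and decision d.
table = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]

reducts = decision_relative_reducts(table, ["a", "b", "c"], "d")
print(reducts)
```

Each reduct found this way is a candidate feature subset, so a table with several reducts could supply different subsets to the members of a homogeneous multiple-classifier system. How RBFS1 and RBFS2 actually choose among reducts is not described in this abstract.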


Bibliographic details

Source: International Conference on Hybrid Information Technology, 2006, 3 pages
Authors: Zbigniew Suraj; Pawel Delimata
Original format: PDF
Chinese Library Classification: G202-53

Similar literature


1. On Taxonomy and Evaluation of Feature Selection-Based Learning Classifier System Ensemble Approaches for Data Mining Problems [J]. Debie Essam, Shafi Kamran, Merrick Kathryn. Computational Intelligence, 2017, No. 3.
2. Relevant Feature Selection Model Using Data Mining for Intrusion Detection System [J]. Ayman I. Madbouly, Amr M. Gody, Tamer M. Barakat. International Journal of Engineering Trends and Technology, 2014, No. 10.
3. Fault diagnosis on material handling system using feature selection and data mining techniques [J]. M. Demetgul, K. Yildiz, S. Taskin. Measurement, 2014.
4. Data Mining Exploration System for Feature Selection Tasks [C]. Zbigniew Suraj, Pawel Delimata. International Conference on Hybrid Information Technology, 2006.
5. Unsupervised data mining methods for functional data analysis and feature selection [D]. Rattakorn, Panaya. 2009.
6. Visual Systems for Interactive Exploration and Mining of Large-Scale Neuroimaging Data Archives [O]. Ian Bowman, Shantanu H. Joshi, John D. Van Horn. 2012.
7. Progressive data mining: An Exploration of using whole-dataset feature selection in building classifiers on three biological problems [O]. VIJAYARAGHAVA SESHADRI SUNDARARAJAN. 2008.
8. Data Mining Feature Subset Weighting and Selection Using Genetic Algorithms [R]. 2002.
