Automatic KDD Data Preparation Using Multi-criteria Features


Abstract

We present a new approach to automatic data preparation that is applicable in most Knowledge Discovery and Data Mining systems and relies on statistical features of the studied database. First, we detect outliers using an approach that depends on whether the data distribution is normal or not. We then show that, when looking for the most appropriate discretization method, what matters is not the probability law a column follows but the shape of its density function. We therefore propose an automatic choice of the best discretization method based on a multi-criteria (Entropy, Variance, Stability) analysis. Experimental evaluations validate our approach: no single discretization method is always the most appropriate.
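The abstract does not give the authors' algorithms, so the following is only a minimal sketch of the two ideas it names, not the paper's method. It assumes a Jarque-Bera check as the normality test, a 3-sigma rule versus Tukey's IQR fences for outliers, and two candidate discretizations (equal width and equal frequency) ranked by a hypothetical score that combines normalized entropy and within-bin variance. The Stability criterion is left out because the abstract does not define it. All function names and thresholds are illustrative.

```python
import math
import statistics
from bisect import bisect_right
from collections import Counter


def looks_normal(values, jb_critical=5.991):
    """Crude normality check via the Jarque-Bera statistic (an assumed choice).

    5.991 is the 5% critical value of a chi-square law with 2 degrees of freedom.
    """
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        return False  # constant column: no meaningful distribution shape
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    return jb < jb_critical


def find_outliers(values):
    """Use a 3-sigma rule for normal-looking data, Tukey's IQR fences otherwise."""
    if looks_normal(values):
        mean, sd = statistics.fmean(values), statistics.pstdev(values)
        return [v for v in values if abs(v - mean) > 3 * sd]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]


def equal_width_edges(values, k):
    """k - 1 cut points splitting [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + i * step for i in range(1, k)]


def equal_frequency_edges(values, k):
    """k - 1 cut points at the empirical quantiles."""
    return statistics.quantiles(values, n=k)


def normalized_entropy(bins, k):
    """Shannon entropy of the bin occupancy, scaled to [0, 1] (requires k >= 2)."""
    n = len(bins)
    h = -sum(c / n * math.log(c / n) for c in Counter(bins).values())
    return h / math.log(k)


def within_bin_variance(values, bins):
    """Average variance inside the bins, weighted by bin size."""
    groups = {}
    for v, b in zip(values, bins):
        groups.setdefault(b, []).append(v)
    return sum(len(g) * statistics.pvariance(g) for g in groups.values()) / len(values)


def choose_discretization(values, k=4):
    """Rank the candidates with a hypothetical score: entropy minus variance ratio."""
    total_var = statistics.pvariance(values)
    if total_var == 0:
        raise ValueError("a constant column cannot be discretized meaningfully")
    candidates = {
        "equal_width": equal_width_edges(values, k),
        "equal_frequency": equal_frequency_edges(values, k),
    }
    scores = {}
    for name, edges in candidates.items():
        bins = [bisect_right(edges, v) for v in values]
        scores[name] = normalized_entropy(bins, k) - within_bin_variance(values, bins) / total_var
    return max(scores, key=scores.get), scores
```

Calling `choose_discretization(column, k=4)` returns the name of the higher-scoring candidate together with both scores, so the ranking can be inspected per column. The equal weighting of the two criteria is an arbitrary choice made for this sketch.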


Bibliographic details

Source
International Conference on Advances in Information Mining and Management, 2015, 6 pages
Authors
Youssef Hmamouche; Christian Ernst; Alain Casali
Original format: PDF
Chinese Library Classification: G20-53
Keywords
Data Mining; Data Preparation; Outliers; Discretization Methods


Similar documents


1. Data preparation for KDD through automatic reasoning based on description logic [J]. Juan A. Lara, David Lizcano, Mª Aurora Martinez. Information Systems. 2014
2. Feature selection and classification in multiple class datasets: An application to KDD Cup 99 dataset [J]. V. Bolon-Canedo, N. Sanchez-Marono, A. Alonso-Betanzos. Expert Systems with Applications. 2011, no. 5
3. Applying Variable Coefficient functions to Self-Organizing Feature Maps for Network Intrusion Detection on the 1999 KDD Cup Dataset [J]. Charlie Obimbo, Matthew Jones. Procedia Computer Science. 2012, no. 1
4. Automatic KDD Data Preparation Using Multi-criteria Features [C]. Youssef Hmamouche, Christian Ernst, Alain Casali. International Conference on Advances in Information Mining and Management. 2015
5. Rough set approach to feature reduction in KDD: Evolutionary computing and data sampling [D]. Rahman, Mohammad Mahibour. 2006
6. Achieving Accurate Automatic Sleep Staging on Manually Pre-processed EEG Data Through Synchronization Feature Extraction and Graph Metrics [O]. Panteleimon Chriskos, Christos A. Frantzidis, Polyxeni T. Gkivogkli. 2018
7. Feature Selection in UNSW-NB15 and KDDCUP’99 datasets [O]. Janarthanan Tharmini, Zargari Shahrzad. 2017
