Conference: Mexican International Conference on Artificial Intelligence

Feature Selection-Ranking Methods in a Very Large Electric Database

Abstract

Feature selection is a crucial activity when knowledge discovery is applied to very large databases, as it reduces dimensionality and therefore the complexity of the problem. Its main objective is to eliminate attributes to obtain a computationally tractable problem without affecting the quality of the solution. Several feature selection methods have been proposed, some of them tested only on small academic datasets. In this paper we evaluate different feature selection-ranking methods on a very large real-world database related to a Mexican electric energy client-invoice system. Most research on feature selection methods evaluates only accuracy and processing time; here we also report on the amount of discovered knowledge and stress the issue of the boundary that separates relevant from irrelevant features. The evaluation was done using the Elvira and Weka tools, which integrate and implement state-of-the-art data mining algorithms. Finally, we propose a promising feature selection heuristic based on the experiments performed.
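
To make the idea of ranking features and then drawing a boundary between relevant and irrelevant ones concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not name the ranking metrics used, so information gain is assumed here as one common filter-style score, and the attribute names, data, and threshold are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in class entropy obtained by splitting on one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = {name: information_gain(values, labels) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data; attribute names and values are invented for illustration.
columns = {
    "tariff": ["A", "A", "B", "B", "A", "B"],
    "region": ["N", "S", "N", "S", "N", "S"],
    "noise": ["x", "y", "x", "x", "y", "y"],
}
labels = ["late", "late", "paid", "paid", "late", "paid"]

ranking = rank_features(columns, labels)

# Assumed cut-off: features scoring below it are treated as irrelevant.
threshold = 0.1
selected = [name for name, score in ranking if score >= threshold]

for name, score in ranking:
    print(f"{name}: {score:.3f}")
print("selected:", selected)
```

The ranking itself is cheap to compute; the harder question, as the abstract notes, is where to place the cut-off. In this sketch the threshold is arbitrary, and moving it changes which attributes survive.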
