Conference proceedings: Data Mining and Knowledge Discovery: Theory, Tools, and Technology
Genetic programming system for building block analysis to enhance data analysis and data mining techniques

Abstract

Recently, many computerized data mining tools and environments have been proposed for finding interesting patterns in large data collections. These tools employ techniques that originate from research in various areas, such as machine learning, statistical data analysis, and visualization. Each of these techniques makes assumptions about the composition of the data collection to be analyzed. If a particular data collection does not meet these assumptions well, the technique usually performs poorly. For example, decision tree tools such as C4.5 rely on rectangular approximations, which do not perform well if the boundaries between different classes have other shapes, such as a 45-degree line or an ellipse. However, if we could find a transformation f that maps the original attribute space to one in which class boundaries are more nearly rectangular, better rectangular approximations could be obtained. In this paper, we address the problem of finding such transformations f. We describe the features of a tool, WOLS, whose goal is to discover ingredients for such transformation functions f; we call these ingredients building blocks. The tool employs genetic programming and symbolic regression for this purpose. We also present and discuss the results of case studies that use the building block analysis tool in the areas of decision tree learning and regression analysis.
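The point about rectangular approximations can be illustrated with a small sketch. It is not WOLS and involves no genetic programming: the data set is synthetic, the helper `best_stump_accuracy` is a hypothetical name, and the building block f(x, y) = y - x is chosen by hand as an assumption about what such a tool might discover. The sketch compares the best single axis-parallel split on the original attributes with the best split once the transformed attribute is added.

```python
import random


def best_stump_accuracy(rows, labels):
    """Accuracy of the best single axis-parallel split (a depth-1 decision tree)."""
    n = len(rows)
    best = 0.0
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            # Predict class 1 when attribute j exceeds threshold t.
            hits = sum((r[j] > t) == bool(y) for r, y in zip(rows, labels))
            # A split may also assign the opposite class to each side.
            best = max(best, hits / n, (n - hits) / n)
    return best


random.seed(0)
# Hypothetical data: two attributes in [0, 1); the class boundary is the 45-degree line y = x.
points = [(random.random(), random.random()) for _ in range(400)]
labels = [1 if y > x else 0 for x, y in points]

# Hand-picked building block f(x, y) = y - x, appended as a third attribute (an assumption).
transformed = [(x, y, y - x) for x, y in points]

raw_acc = best_stump_accuracy(points, labels)
new_acc = best_stump_accuracy(transformed, labels)
print(f"original attributes: {raw_acc:.3f}")
print(f"with y - x added:    {new_acc:.3f}")
```

On this synthetic data a single threshold on y - x separates the classes exactly, whereas any single split on x or y alone leaves a substantial fraction of points misclassified. A deeper tree on the original attributes would approximate the diagonal with a staircase of rectangles, which is the inefficiency the abstract describes.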