Journal: Data Mining and Knowledge Discovery

High-Performance Commercial Data Mining: A Multistrategy Machine Learning Application

Abstract

We present an application of inductive concept learning and interactive visualization techniques to a large-scale commercial data mining project. This paper focuses on design and configuration of high-level optimization systems (wrappers) for relevance determination and constructive induction, and on integrating these wrappers with elicited knowledge on attribute relevance and synthesis. In particular, we discuss decision support issues for the application (cost prediction for automobile insurance markets in several states) and report experiments using D2K, a Java-based visual programming system for data mining and information visualization, and several commercial and research tools. We describe exploratory clustering, descriptive statistics, and supervised decision tree learning in this application, focusing on a parallel genetic algorithm (GA) system, Jenesis, which is used to implement relevance determination (attribute subset selection). Deployed on several high-performance network-of-workstation systems (Beowulf clusters), Jenesis achieves a linear speedup, due to a high degree of task parallelism. Its test set accuracy is significantly higher than that of decision tree inducers alone and is comparable to that of the best extant search-space-based wrappers.
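To make the idea of a GA wrapper for attribute subset selection concrete, here is a minimal sketch in Python. It is not Jenesis or D2K code, and nothing in it comes from the paper. The synthetic data, the nearest-centroid classifier (standing in for a decision tree inducer), and all function names and parameter values are assumptions chosen for illustration. Each chromosome is a bit mask over attributes, and its fitness is the validation accuracy of the classifier trained on only the selected attributes.

```python
import random

def make_data(rng, n=200, n_relevant=2, n_noise=6):
    """Hypothetical synthetic data: only the first n_relevant attributes carry signal."""
    rows = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are always present
        relevant = [rng.gauss(2.0 * label, 1.0) for _ in range(n_relevant)]
        noise = [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
        rows.append((relevant + noise, label))
    return rows

def accuracy(mask, train, valid):
    """Wrapper fitness: validation accuracy of a nearest-centroid classifier
    (a stand-in inducer) restricted to the attributes selected by mask."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty attribute subset cannot classify anything
    centroids = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        centroids[label] = [sum(p[c] for p in pts) / len(pts) for c in cols]
    correct = 0
    for x, y in valid:
        dist = {lab: sum((x[c] - m) ** 2 for c, m in zip(cols, cen))
                for lab, cen in centroids.items()}
        correct += min(dist, key=dist.get) == y
    return correct / len(valid)

def tournament(scored, rng, k=3):
    """Return the fittest mask among k randomly drawn (fitness, mask) pairs."""
    return max(rng.sample(scored, k), key=lambda fm: fm[0])[1]

def ga_select(train, valid, n_attrs, rng, pop_size=20, generations=15, p_mut=0.1):
    """Search attribute subsets with a simple generational GA with elitism."""
    pop = [[rng.randint(0, 1) for _ in range(n_attrs)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Each fitness evaluation is independent of the others; this sketch
        # runs them serially.
        scored = sorted(((accuracy(m, train, valid), m) for m in pop),
                        key=lambda fm: fm[0], reverse=True)
        history.append(scored[0][0])
        next_pop = [scored[0][1], scored[1][1]]  # elitism: keep the two best masks
        while len(next_pop) < pop_size:
            a, b = tournament(scored, rng), tournament(scored, rng)
            cut = rng.randint(1, n_attrs - 1)  # one-point crossover
            child = [bit ^ int(rng.random() < p_mut) for bit in a[:cut] + b[cut:]]
            next_pop.append(child)
        pop = next_pop
    best_fit, best = max(((accuracy(m, train, valid), m) for m in pop),
                         key=lambda fm: fm[0])
    return best, best_fit, history

rng = random.Random(0)
rows = make_data(rng)
train, valid = rows[:140], rows[140:]
best_mask, best_fit, history = ga_select(train, valid, n_attrs=8, rng=rng)
print("selected attribute mask:", best_mask)
print("validation accuracy:", best_fit)
```

Because every candidate mask is scored independently, the fitness evaluations within a generation could be farmed out to separate workers. That is one plausible reading of the task parallelism the abstract credits for the speedup, though the sketch itself runs on a single process.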