Journal: Artificial Intelligence

Itemset mining: A constraint programming perspective

Abstract

The field of data mining has become accustomed to specifying constraints on patterns of interest. A large number of systems and techniques have been developed for solving such constraint-based mining problems, especially for mining itemsets. The approach taken in the field of data mining contrasts with the constraint programming principles developed within the artificial intelligence community. While most data mining research focuses on algorithmic issues and aims at developing highly optimized and scalable implementations that are tailored towards specific tasks, constraint programming employs a more declarative approach. The emphasis lies on developing high-level modeling languages and general solvers that specify what the problem is, rather than outlining how a solution should be computed, yet are powerful enough to be used across a wide variety of applications and application domains. This paper contributes a declarative constraint programming approach to data mining. More specifically, we show that it is possible to employ off-the-shelf constraint programming techniques for modeling and solving a wide variety of constraint-based itemset mining tasks, such as frequent, closed, discriminative, and cost-based itemset mining. In particular, we develop a basic constraint programming model for specifying frequent itemsets and show that this model can easily be extended to realize the other settings. This contrasts with typical procedural data mining systems, where the underlying procedures need to be modified in order to accommodate new types of constraints, or novel combinations thereof. Even though state-of-the-art data mining systems outperform the constraint programming approach on some standard tasks, we also show that there exist problems where the constraint programming approach leads to significant performance improvements over state-of-the-art methods in data mining, as well as to new insights into the underlying data mining problems. Many such insights can be obtained by relating the underlying search algorithms of data mining and constraint programming systems to one another. We discuss a number of interesting new research questions and challenges raised by the declarative constraint programming approach to data mining.
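The abstract does not spell out the model itself, so the following is only a minimal sketch of the declarative idea, not the paper's formulation. It assumes a tiny made-up binary transaction database `D`, and the names `covers`, `frequent`, `closed`, and `solve` are hypothetical. Each constraint is stated as a predicate on a 0/1 item assignment, and moving from frequent to closed frequent itemsets means adding one predicate rather than changing the search. Note that the "solver" here is naive generate-and-test enumeration standing in for a real constraint programming solver, which would use propagation to prune the search.

```python
from itertools import product

# Toy transaction database (hypothetical data): rows are transactions,
# columns are items; D[t][i] == 1 means transaction t contains item i.
D = [
    [1, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
]


def covers(items, row):
    """Coverage: the transaction contains every selected item."""
    return all(row[i] == 1 for i, selected in enumerate(items) if selected)


def frequent(items, db, min_support):
    """Frequency constraint: at least min_support transactions are covered."""
    return sum(covers(items, row) for row in db) >= min_support


def closed(items, db):
    """Closure constraint: an item is selected exactly when every
    covered transaction contains it."""
    covered = [row for row in db if covers(items, row)]
    return all(
        bool(selected) == all(row[i] == 1 for row in covered)
        for i, selected in enumerate(items)
    )


def solve(db, constraints):
    """Naive stand-in for a solver: enumerate every 0/1 item assignment
    and yield the itemsets (as tuples of item indices) that satisfy all
    constraints. A real CP solver would propagate instead of enumerating."""
    n_items = len(db[0])
    for items in product((0, 1), repeat=n_items):
        if all(constraint(items) for constraint in constraints):
            yield tuple(i for i, selected in enumerate(items) if selected)


# Frequent itemsets with minimum support 2.
frequent_sets = list(solve(D, [lambda s: frequent(s, D, 2)]))

# Closed frequent itemsets: same search, one extra constraint.
closed_sets = list(solve(D, [lambda s: frequent(s, D, 2),
                             lambda s: closed(s, D)]))

print(frequent_sets)
print(closed_sets)
```

The design point is that `solve` never changes: a new mining task is expressed by adding or swapping predicates in the constraint list, which mirrors the contrast the abstract draws with procedural mining systems.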
