Published in: Data Mining Workshops, ICDMW, 2008 IEEE International Conference on

If Constraint-Based Mining is the Answer: What is the Constraint? (Invited Talk)

Abstract

Constraint-based mining has proven extremely useful. It has been applied not only to many pattern discovery settings (e.g., sequential pattern mining) but also, more recently, to classification and clustering tasks. It appears as a key technology for an inductive database perspective on knowledge discovery in databases (KDD), and constraint-based mining is indeed an answer to important data mining issues (e.g., supporting a priori relevancy and subjective interestingness, but also achieving computational feasibility). However, few authors study the nature of constraints and their semantics. Considering several examples of nontrivial KDD processes, we discuss the Hows, Whys, and Whens of constraints. Our thesis is that most typical data mining methods are constraint-based techniques and that it is worth studying and designing them as such. In many cases, we exploit constraints that are not really explicit (e.g., the objective function optimization of a clustering for a given similarity measure) and/or constraints whose operational semantics are relaxed w.r.t. their declarative counterparts (e.g., the optimization constraint is not enforced because of some local optimization heuristics). We think it is important to make explicit every primitive constraint and the operators that combine them, because this constitutes the declarative semantics of the constraints and thus of the mining queries. Then, a well-studied challenge is to design operational semantics, such as correct and complete solvers and/or relaxation schemes, for more or less complex constraints. The design of complete solvers has been studied extensively in useful yet limited settings (see, e.g., the algorithms that exploit combinations of monotonic and anti-monotonic primitives). It is however clear that many relevant constraints lack such nice properties. On the other hand, understanding constraint relaxation strategies remains fairly open, certainly because of its intrinsically heuristic nature. Interestingly, recent approaches that suggest global pattern or model construction based on local patterns make it possible to revisit the relaxation issue thanks to constraint back-propagation possibilities. This can be discussed within a case study on constrained co-clustering.
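As an illustration of what "exploiting combinations of monotonic and anti-monotonic primitives" can look like in practice, here is a minimal sketch of a levelwise itemset miner. It is not taken from the talk: the function names, the toy data, and the two constraints (a minimum support, which is anti-monotonic, and a minimum total price, which is monotonic) are all assumptions chosen for illustration. The anti-monotonic constraint prunes the search space, while the monotonic one is only checked as a filter on the surviving itemsets.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def mine(transactions, min_support, price, min_total_price):
    """Levelwise search (hypothetical helper, illustrative only).

    Anti-monotonic constraint: support(S) >= min_support.
      If S violates it, every superset of S does too, so S is pruned.
    Monotonic constraint: sum of prices in S >= min_total_price.
      If S satisfies it, every superset does too; here it is only
      used as a filter on the itemsets that survive pruning.
    """
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    results = []
    while level:
        # Keep only candidates that satisfy the anti-monotonic constraint.
        frequent = [s for s in level if support(s, transactions) >= min_support]
        frequent_set = set(frequent)
        # Report the ones that also satisfy the monotonic constraint.
        results.extend(
            s for s in frequent if sum(price[i] for i in s) >= min_total_price
        )
        # Build candidates one item larger, all of whose subsets of the
        # current size are frequent (safe because of anti-monotonicity).
        candidates = set()
        for a, b in combinations(frequent, 2):
            c = a | b
            if len(c) == len(a) + 1 and all(
                frozenset(sub) in frequent_set for sub in combinations(c, len(a))
            ):
                candidates.add(c)
        level = sorted(candidates, key=sorted)
    return results


# Toy data (assumed for the example).
transactions = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("bd")]
price = {"a": 1, "b": 2, "c": 3, "d": 10}
print(mine(transactions, min_support=2, price=price, min_total_price=4))
```

A complete solver in the sense of the abstract would return exactly the itemsets satisfying the conjunction of both constraints, which this sketch does for the toy data. Pushing the monotonic constraint into the search as well, rather than filtering afterwards, is the kind of refinement the cited algorithms address and is not attempted here.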