Venue: ACM Conference on Information and Knowledge Management

Pattern Discovery for Large Mixed-Mode Database


Abstract

In business and industry today, large databases with mixed data types (continuous and categorical) are very common, and there is a great need to discover patterns in them for knowledge interpretation and understanding. In the past, for classification, this problem has been solved as a discrete data problem by first discretizing the continuous data based on the class-attribute interdependence relationship. However, no proper solution has existed so far when class information is unavailable. Hence, important pattern post-processing tasks such as pattern clustering and summarization cannot be applied to mixed-mode data. This paper presents a new method for solving the problem, based on two essential concepts. (1) Even when class information is absent, in a correlated dataset the attribute with the strongest interdependence with the others in the group can be used to drive the discretization of the continuous data. (2) For a large database, correlated attribute groups must first be obtained by attribute clustering before (1) can be applied. Based on (1) and (2), pattern discovery methods are developed for mixed-mode data. Extensive experiments using synthetic and real-world data were conducted to validate the usefulness and effectiveness of the proposed method.
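
Concept (1) can be illustrated with a small Python sketch. This is not the paper's algorithm: the abstract does not say how interdependence is measured, so the sketch assumes normalized mutual information, and it assumes a rough equal-frequency binning of the continuous attributes before scoring. All function names and the toy data (`grade`, `temp`, `pressure`) are hypothetical.

```python
import math
from collections import Counter


def equal_frequency_bins(values, n_bins=3):
    """Assign each continuous value a bin index so bins hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = min(rank * n_bins // len(values), n_bins - 1)
    return labels


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def normalized_mi(x, y):
    """Mutual information of x and y divided by their joint entropy.

    Assumed interdependence measure; the abstract does not name one.
    """
    joint = entropy(list(zip(x, y)))
    if joint == 0:
        return 0.0
    return (entropy(x) + entropy(y) - joint) / joint


def pick_driving_attribute(columns):
    """Return the attribute with the largest summed interdependence with the rest."""
    names = list(columns)
    scores = {
        a: sum(normalized_mi(columns[a], columns[b]) for b in names if b != a)
        for a in names
    }
    return max(scores, key=scores.get), scores


# Toy correlated attribute group: one categorical and two continuous attributes.
grade = ["low"] * 4 + ["mid"] * 4 + ["high"] * 4
temp = [1.0, 1.2, 0.9, 4.8, 1.1, 5.1, 4.9, 5.2, 9.0, 9.1, 8.9, 9.2]
pressure = [10, 11, 12, 13, 20, 21, 22, 31, 23, 30, 32, 33]

columns = {
    "grade": grade,                           # already categorical
    "temp": equal_frequency_bins(temp),       # rough preliminary binning
    "pressure": equal_frequency_bins(pressure),
}

driver, scores = pick_driving_attribute(columns)
print("driving attribute:", driver)
for name, score in scores.items():
    print(f"  {name}: {score:.3f}")
```

In this toy group the categorical attribute scores highest, so its values could stand in for the missing class labels when a class-dependent discretizer chooses cut points for `temp` and `pressure`. For a large database, concept (2) implies running such a selection once per attribute cluster, not over all attributes at once.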
