Cross-relational clustering with user's guidance

Abstract

Clustering is an essential data mining task with numerous applications. However, data in most real-life applications are high-dimensional, and the relevant information is often spread across multiple relations. To make high-dimensional, cross-relational clustering both effective and efficient, we propose a new approach, called CrossClus, which performs cross-relational clustering under user guidance. We believe that user guidance, even in very simple forms, can be essential for effective high-dimensional clustering, since the user knows the application requirements and the data semantics well. CrossClus works as follows: a user specifies a clustering task and selects one feature, or a small set of features, pertinent to the task. CrossClus then extracts the set of highly relevant features from multiple relations connected via linkages defined in the database schema, evaluates their effectiveness against the user's guidance, and identifies interesting clusters that fit the user's needs. The method addresses both the quality of feature extraction and the efficiency of clustering. Our comprehensive experiments demonstrate the effectiveness and scalability of this approach.
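The workflow outlined in the abstract can be illustrated with a minimal Python sketch. It is not the CrossClus algorithm itself: the toy relations, the pairwise-agreement relevance score, the 0.5 threshold, and the final grouping step are all assumptions made for illustration, and every name in the code is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical target relation: the user picks "area" as the guiding feature.
students = {
    "s1": {"area": "db"},
    "s2": {"area": "db"},
    "s3": {"area": "ai"},
    "s4": {"area": "ai"},
}
# Hypothetical linked relation (student_id, course, semester),
# joined to the target relation through student_id.
enrollments = [
    ("s1", "databases", "fall"),
    ("s2", "databases", "spring"),
    ("s3", "learning", "fall"),
    ("s4", "learning", "spring"),
]

def feature_from_link(rows, column):
    """Map each target id to the set of values it reaches through the link."""
    feature = defaultdict(set)
    for row in rows:
        feature[row[0]].add(row[column])
    return dict(feature)

def pair_vector(ids, feature):
    """For each pair of target tuples: 1 if they share a value, else 0."""
    return [
        1 if feature.get(a, set()) & feature.get(b, set()) else 0
        for a, b in combinations(ids, 2)
    ]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = (sum(x * x for x in u) * sum(y * y for y in v)) ** 0.5
    return dot / norm if norm else 0.0

ids = sorted(students)

# Step 1: the user's guidance, expressed as one feature on the target relation.
guidance = {sid: {row["area"]} for sid, row in students.items()}
guidance_vec = pair_vector(ids, guidance)

# Step 2: candidate features reached through the schema linkage.
candidates = {
    "course": feature_from_link(enrollments, 1),
    "semester": feature_from_link(enrollments, 2),
}

# Step 3: score each candidate by how similarly it groups tuples
# compared with the guidance (an assumed stand-in relevance measure).
relevance = {
    name: cosine(guidance_vec, pair_vector(ids, feat))
    for name, feat in candidates.items()
}
selected = sorted(name for name, score in relevance.items() if score >= 0.5)

# Step 4: group tuples that agree on every selected feature
# (a deliberately simple stand-in for a real clustering algorithm).
groups = defaultdict(list)
for sid in ids:
    key = tuple(frozenset(candidates[name].get(sid, set())) for name in selected)
    groups[key].append(sid)
clusters = sorted(groups.values())

print(relevance)
print(selected)
print(clusters)
```

In this toy data the course feature groups students exactly as the guiding feature does, while the semester feature cuts across it, so only the former is kept for clustering. A real implementation would need a relevance measure and a clustering step that scale to large, multi-relation databases.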