Dual clustering: integrating data clustering over optimization and constraint domains

Cheng-Ru Lin; Ken-Hao Liu; Ming-Syan Chen

Abstract

Spatial clustering has attracted a lot of research attention due to its various applications. In most conventional clustering problems, the similarity measurement mainly takes the geometric attributes into consideration. However, in many real applications, the nongeometric attributes are what users are concerned about. In conventional spatial clustering, the input data set is partitioned into several compact regions, and data points that are similar to one another in their nongeometric attributes may be scattered over different regions, thus making the corresponding objective difficult to achieve. To remedy this, we propose and explore in this paper a new clustering problem on two domains, called dual clustering, where one domain refers to the optimization domain and the other refers to the constraint domain. Attributes on the optimization domain are those involved in the optimization of the objective function, while those on the constraint domain specify the application-dependent constraints. Our goal is to optimize the objective function in the optimization domain while satisfying the constraint specified in the constraint domain. We devise an efficient and effective algorithm, named Interlaced Clustering-Classification, abbreviated as ICC, to solve this problem. The proposed ICC algorithm combines the information in both domains and iteratively performs a clustering algorithm on the optimization domain and also a classification algorithm on the constraint domain to reach the target clustering effectively. The time and space complexities of the ICC algorithm are formally analyzed. Several experiments are conducted to provide insights into the dual clustering problem and the proposed algorithm.
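
The interlacing idea in the abstract (a clustering step on the optimization domain alternating with a classification step on the constraint domain) can be illustrated with a small sketch. This is not the paper's ICC algorithm: the abstract does not say which clustering or classification methods ICC uses, so the sketch assumes a k-means-style assignment for the clustering step and a k-nearest-neighbor majority vote for the classification step. All function names, parameters, and data below are hypothetical.

```python
import math
from collections import Counter


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def knn_relabel(constraint_pts, labels, k_neighbors):
    """Classification step (assumed: k-NN majority vote). Each point takes
    the majority label among its k nearest neighbors, itself included,
    measured in the constraint domain."""
    new_labels = []
    for p in constraint_pts:
        order = sorted(range(len(constraint_pts)),
                       key=lambda i: math.dist(p, constraint_pts[i]))
        votes = Counter(labels[i] for i in order[:k_neighbors])
        new_labels.append(votes.most_common(1)[0][0])
    return new_labels


def interlaced_sketch(opt_pts, constraint_pts, init_centers,
                      k_neighbors=3, max_iter=20):
    """Alternate clustering (optimization domain) and classification
    (constraint domain) until the labels stop changing."""
    centers = list(init_centers)
    labels = None
    for _ in range(max_iter):
        # Clustering step: assign each point to the nearest center,
        # using only its optimization-domain attributes.
        proposed = [nearest(p, centers) for p in opt_pts]
        # Classification step: smooth the proposed labels so that clusters
        # stay compact in the constraint domain.
        new_labels = knn_relabel(constraint_pts, proposed, k_neighbors)
        if new_labels == labels:
            break
        labels = new_labels
        # Update each center from its current members (optimization domain).
        for j in range(len(centers)):
            members = [p for p, lab in zip(opt_pts, labels) if lab == j]
            if members:
                centers[j] = mean(members)
    return labels, centers


# Hypothetical data: two spatial regions, each holding one point whose
# nongeometric value resembles the other region's values.
constraint_pts = [(0, 0), (0, 1), (1, 0), (1, 1),
                  (10, 10), (10, 11), (11, 10), (11, 11)]
opt_pts = [(1.0,), (1.2,), (9.0,), (0.8,), (9.1,), (8.9,), (9.3,), (1.1,)]
labels, centers = interlaced_sketch(opt_pts, constraint_pts,
                                    init_centers=[(0.0,), (10.0,)])
print(labels)
```

In this toy run the two points with atypical nongeometric values are absorbed into the cluster of their spatial neighbors, which is the kind of trade-off between the two domains that the abstract describes. How the actual ICC algorithm balances the objective against the constraint is specified in the paper, not here.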


Bibliographic details

Source
IEEE Transactions on Knowledge and Data Engineering, 2005, Issue 5, pp. 628-637, 10 pages
Authors
Cheng-Ru Lin; Ken-Hao Liu; Ming-Syan Chen
Original format: PDF
Language: English
Chinese Library Classification: computing technology, computer technology
Keywords
computational complexity; data mining; optimisation; pattern clustering; spatial reasoning; visual databases; Interlaced Clustering-Classification algorithm; constraint domain; dual clustering; nongeometric attribute; optimization domain; spatial data clustering


