Centroid based Tree-Structured Data Clustering Using Vertex/Edge Overlap and Graph Edit Distance

Dinler Derya; Tural Mustafa Kemal; Ozdemirel Nur Evin

Abstract

We consider a clustering problem in which the data objects are rooted m-ary trees with known node correspondence. We assume that the nodes of the trees are unweighted, but the edges can be unweighted or weighted. We measure the similarity and the distance between two trees using vertex/edge overlap (VEO) and graph edit distance (GED), respectively. For both measures, we first study the problem of finding a centroid tree of a given cluster of trees in both the unweighted and weighted edge cases. We compute the optimal centroid tree of a given cluster for all measures except the weighted VEO, for which a heuristic is developed. We then propose k-means based algorithms that repeat cluster assignment and centroid update steps until convergence. The initial centroid trees are constructed based on the properties of the data. The assignment steps use unweighted or weighted versions of VEO or GED to assign each tree to the most similar centroid tree. In the update steps, each centroid tree is updated by considering the trees assigned to it. The proposed algorithms are compared with the traditional k-modes and k-means on randomly generated datasets and are shown to be more effective and more robust to outliers in separating trees into clusters. We also apply our algorithms to a real-world brain artery dataset and show that the previously observed age and sex effects on brain artery structures are revealed better by clustering with our algorithms than with the traditional k-modes and k-means.
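The unweighted measures and the assign/update loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the commonly used definitions VEO = 2(|V∩V'| + |E∩E'|) / (|V| + |V'| + |E| + |E'|) and, under known node correspondence, GED = number of nodes and edges present in exactly one of the two trees. It also assumes that every node label has the same parent in every tree (for example, labels encode positions in a complete m-ary tree), so a tree can be stored as a set of (parent, child) edges. The names `ROOT`, `nodes`, `veo`, `ged`, `majority_centroid`, and `cluster_trees` are hypothetical, and the majority-vote update is an illustrative choice; the abstract does not describe the paper's centroid procedures or its initialization.

```python
from collections import Counter

ROOT = "r"  # hypothetical label shared by the root of every tree


def nodes(edges):
    """Node set of a rooted tree given as a set of (parent, child) edges."""
    return {ROOT} | {v for edge in edges for v in edge}


def veo(a, b):
    """Unweighted vertex/edge overlap similarity, a value in [0, 1]."""
    na, nb = nodes(a), nodes(b)
    shared = len(na & nb) + len(a & b)
    return 2 * shared / (len(na) + len(nb) + len(a) + len(b))


def ged(a, b):
    """Unweighted graph edit distance under known node correspondence:
    the count of nodes and edges that appear in exactly one tree."""
    return len(nodes(a) ^ nodes(b)) + len(a ^ b)


def majority_centroid(cluster):
    """Illustrative update: keep edges found in more than half of the trees."""
    counts = Counter(edge for tree in cluster for edge in tree)
    return frozenset(e for e, c in counts.items() if 2 * c > len(cluster))


def cluster_trees(trees, centroids, max_iter=100):
    """Alternate assignment and update steps until assignments stop changing."""
    labels = None
    for _ in range(max_iter):
        # Assignment step: each tree goes to the centroid with the smallest GED.
        new_labels = [
            min(range(len(centroids)), key=lambda j: ged(t, centroids[j]))
            for t in trees
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: recompute each centroid from the trees assigned to it.
        for j in range(len(centroids)):
            members = [t for t, lab in zip(trees, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = majority_centroid(members)
    return labels, centroids
```

For example, with `t1 = frozenset({("r", "a"), ("r", "b")})` and `t2 = t1 | {("a", "c")}`, the two trees differ by one node and one edge, so `ged(t1, t2)` is 2 and `veo(t1, t2)` is 10/12. Swapping `ged` for `1 - veo` in the assignment step would give a VEO-based variant of the same loop.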

Bibliographic details

Source: Annals of Operations Research, 2020, Issue 1, pp. 85-122 (38 pages)
Authors: Dinler Derya; Tural Mustafa Kemal; Ozdemirel Nur Evin
Affiliations:

Department of Industrial Engineering, Hacettepe University, Ankara, Turkey;

Department of Industrial Engineering, Middle East Technical University, Ankara, Turkey;

Department of Industrial Engineering, Middle East Technical University, Ankara, Turkey

Original format: PDF
Language: English
Keywords: Tree-structured data objects; Clustering; Brain artery data; Vertex/edge overlap; Graph edit distance; k-means; Centroid
