Combined clustering of graph and attribute data



Abstract

In recent years, a rapidly increasing amount of data has been collected and stored for various applications. As modern storage systems provide increasing disk space at decreasing cost, databases storing huge amounts of information of different types have become ubiquitous. The task of automatically extracting useful and previously unknown knowledge from such data is called data mining. This thesis focuses on the data mining task of clustering, i.e., grouping objects into clusters such that objects assigned to the same cluster are similar to each other, while objects assigned to different clusters are dissimilar. Two of the most common data types are vector data, where each object is represented as a vector of its attributes, and graph data, which represents relationships between objects as edges in a graph. In many applications, data of both types is available simultaneously: the vertices or the edges of a graph carry additional information that can be described as an attribute vector. The aim of this thesis is to develop combined clustering approaches that use graph data and attribute data simultaneously in order to detect clusters that are densely connected in the graph and at the same time show similarity in the attribute space. Because clusters in high-dimensional vector data usually exist only in subspaces of the attribute space, we follow the principle of subspace clustering to enable the detection of clusters that show similarity in only a subset of the attributes. In this thesis, we introduce combined clustering approaches for graphs with vertex attributes, graphs with edge attributes, and heterogeneous networks with attributed vertices. For all of these data types, our approaches focus on realizing an unbiased combination of graph and attribute data and on avoiding redundancy in the clustering result.


Bibliographic details

Author: Boden Brigitte
Year: 2014
Original format: PDF
Language: English

Similar literature


1. Combining attribute content and label information for categorical data ensemble clustering [J]. Yu Liqin, Cao Fuyuan, Zhao Xingwang. Applied mathematics and computation. 2020, No. 1
2. Community Detection Algorithm Combining Stochastic Block Model and Attribute Data Clustering [J]. Kataoka Shun, Kobayashi Takuto, Yasuda Muneki. Journal of the Physical Society of Japan. 2016, No. 11
3. Combined use of association rules mining and clustering methods to find relevant links between binary rare attributes in a large data set [J]. Marie Plasse, Ndeye Niang, Gilbert Saporta. Computational statistics & data analysis. 2007, No. 1
4. Integrated KL (K-means - Laplacian) Clustering: A New Clustering Approach by Combining Attribute Data and Pairwise Relations [C]. Fei Wang, Chris Ding, Tao Li. SIAM International Conference on Data Mining. 2009
5. Joint hedonic travel cost method: Combining revealed and stated preference data to estimate demand for attribute quality of sport fishing in Illinois [D]. Araujo, Rogerio C. Pereira. 2002
6. Complete bibliographic data cluster assignments and combined citation network of emergency response operations research extant literature [O]. J.P. Minas, N.C. Simpson, Z.Y. Tacheva. 2020
7. Integrated KL (K-means- Laplacian) Clustering: A New Clustering Approach by Combining Attribute Data and Pairwise Relations [O]. Fei Wang, Chris Ding, Tao Li. 2009
