Efficient Clustering for High Dimensional Data: Subspace Based Clustering and Density Based Clustering

Singh Vijendra


Abstract

Finding clusters in a high dimensional data space is challenging because such a space has hundreds of attributes and hundreds of data tuples, and the average density of data points is very low. The distance functions used by many conventional algorithms fail in this scenario. Clustering relies on computing the distance between objects, so the complexity of the similarity model has a severe influence on the efficiency of the clustering algorithm. For density-based clustering in particular, range queries must be supported efficiently to reduce the runtime of clustering. Density-based clustering is also affected by the density divergence problem, which reduces the accuracy of clustering. If clusters do not exist in the original high dimensional data space, they may still exist in some of its subspaces. Subspace clustering algorithms localize the search for relevant dimensions, allowing them to find clusters that exist in multiple, possibly overlapping subspaces. For clustering based on relative region densities in the subspaces, density-based subspace clustering algorithms are applied; these regard clusters as regions whose densities are high relative to the region densities in a subspace. This study reviews various subspace-based clustering algorithms and density-based clustering algorithms, together with their efficiencies on different data sets.
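The point about range queries and subspaces can be illustrated with a minimal sketch. It is not taken from the reviewed paper: the synthetic data, the helper names `range_query` and `make_data`, and the parameters `eps` and `min_pts` are all assumptions chosen for illustration. The sketch plants a cluster in two of twenty dimensions and compares the size of the eps-neighborhood of one cluster member in the full space against the same query restricted to the relevant subspace.

```python
import math
import random


def range_query(points, query, eps, dims):
    """Return indices of points within Euclidean distance eps of query,
    measuring distance only over the dimensions listed in dims."""
    hits = []
    for i, p in enumerate(points):
        dist = math.sqrt(sum((p[d] - query[d]) ** 2 for d in dims))
        if dist <= eps:
            hits.append(i)
    return hits


def make_data(n_cluster=50, n_noise=150, n_dims=20, seed=0):
    """Synthetic data (assumed for illustration): the first n_cluster points
    are tightly grouped in dimensions 0 and 1 and uniform elsewhere; the
    remaining points are uniform noise in every dimension."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_cluster):
        p = [rng.random() for _ in range(n_dims)]
        p[0] = rng.gauss(0.5, 0.02)
        p[1] = rng.gauss(0.5, 0.02)
        points.append(p)
    for _ in range(n_noise):
        points.append([rng.random() for _ in range(n_dims)])
    return points


points = make_data()
query = points[0]          # a member of the planted cluster
eps, min_pts = 0.15, 10    # hypothetical density parameters

# Neighborhood size in the full 20-dimensional space.
full_count = len(range_query(points, query, eps, range(20)))
# Neighborhood size in the subspace spanned by dimensions 0 and 1.
sub_count = len(range_query(points, query, eps, (0, 1)))

print("full space neighbors:", full_count, "dense:", full_count >= min_pts)
print("subspace neighbors:  ", sub_count, "dense:", sub_count >= min_pts)
```

Under these assumptions the query point looks isolated in the full space, because the eighteen irrelevant dimensions dominate the distance, while the same point has a dense neighborhood once the query is restricted to the two relevant dimensions. The linear scan in `range_query` is the simplest possible choice; the abstract's remark about efficiency suggests that a practical algorithm would replace it with an index structure.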


Bibliographic record

Source
Information Technology Journal | 2011, Issue 6 | 14 pages
Author
Singh Vijendra
Original format: PDF
Text language: English
CLC classification: Information processing technology
Keywords
Feature selection; Subspace clustering; Density based clustering; High dimensional data
Date added to database: 2022-08-18 10:30:54

