A feature group weighting method for subspace clustering of high-dimensional data

Chen X.; Ye Y.; Xu X.; Huang J.Z.

Abstract

This paper proposes a new method for weighting subspaces at two levels, feature groups and individual features, when clustering high-dimensional data. The features of the data are first divided into feature groups according to their natural characteristics. Two types of weights are then introduced into the clustering process to simultaneously identify the importance of feature groups and of individual features in each cluster. A new optimization model defines the clustering objective, and a new clustering algorithm, FG-k-means, is proposed to solve it. The algorithm extends k-means with two additional steps that automatically calculate the two types of subspace weights. A new data generation method is presented to generate high-dimensional data with clusters in subspaces of both feature groups and individual features. Experimental results on synthetic and real-life data show that FG-k-means significantly outperformed four k-means type algorithms (k-means, W-k-means, LAC and EWKM) in almost all experiments. The new algorithm is also robust to noise and missing values, which commonly exist in high-dimensional data.
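
The abstract describes k-means extended with two weight-update steps but does not give the update formulas. The following is a minimal sketch of that loop, assuming entropy-regularised (softmax) updates in the style of EWKM for both weight types. The function names, the parameters `lam` and `eta`, and the requirement to pass initial centres are illustrative assumptions, not the authors' published algorithm.

```python
import numpy as np


def neg_softmax(cost, temperature):
    """Row-wise softmax of -cost / temperature (shifted for stability)."""
    z = -np.asarray(cost, dtype=float) / temperature
    z -= z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def fg_kmeans_sketch(X, groups, init_centres, lam=1.0, eta=1.0, n_iter=10):
    """Illustrative k-means with per-cluster group and feature weights.

    X            : (n, m) data matrix
    groups       : length-m integer array, feature j belongs to groups[j]
    init_centres : (k, m) initial cluster centres
    lam, eta     : assumed smoothing parameters for the two weight types
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups, dtype=int)
    Z = np.array(init_centres, dtype=float)
    k, m = Z.shape
    T = int(groups.max()) + 1
    onehot = np.eye(T)[groups]        # (m, T) feature-to-group indicator
    W = np.full((k, T), 1.0 / T)      # feature-group weights per cluster
    V = np.full((k, m), 1.0 / m)      # individual feature weights per cluster

    for _ in range(n_iter):
        # k-means step 1: assign each point to the nearest centre under the
        # doubly weighted squared distance.
        coef = W[:, groups] * V                        # (k, m)
        sq = (X[:, None, :] - Z[None, :, :]) ** 2      # (n, k, m)
        labels = (sq * coef).sum(axis=2).argmin(axis=1)

        # k-means step 2: recompute centres and per-feature dispersions.
        disp = np.zeros((k, m))
        for l in range(k):
            members = X[labels == l]
            if len(members):
                Z[l] = members.mean(axis=0)
                disp[l] = ((members - Z[l]) ** 2).sum(axis=0)

        # Extra step A: group weights shrink as the group's weighted
        # within-cluster dispersion grows.
        W = neg_softmax((V * disp) @ onehot, lam)
        # Extra step B: feature weights, given the new group weights.
        V = neg_softmax(W[:, groups] * disp, eta)

    return labels, Z, W, V
```

Calling `fg_kmeans_sketch(X, groups, X[[0, -1]])` on data whose clusters separate only in the features of group 0 should push the group weights in `W` toward that group in each cluster. Each row of `W` and of `V` sums to 1 by construction.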

Bibliographic details

Source: Pattern Recognition: The Journal of the Pattern Recognition Society, 2012, Issue 1, 13 pages
Authors: Chen X.; Ye Y.; Xu X.; Huang J.Z.
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: Data mining; Feature weighting; High-dimensional data analysis; k-Means; Subspace clustering
Date added to database: 2022-08-18 15:50:36

Similar literature

1. A feature group weighting method for subspace clustering of high-dimensional data [J]. Chen X., Ye Y., Xu X., Huang J.Z. Pattern Recognition: The Journal of the Pattern Recognition Society. 2012, Issue 1.
2. An Entropy Weighting k-Means Algorithm for Subspace Clustering of High-Dimensional Sparse Data [J]. Jing Liping, Ng Michael K., Huang Joshua Zhexue. IEEE Transactions on Knowledge and Data Engineering. 2007, Issue 8.
3. Model-based approach for high-dimensional non-Gaussian visual data clustering and feature weighting [J]. Elguebaly Tarek, Bouguila Nizar. Digital Signal Processing. 2015.
4. A Soft Subspace Clustering Method for Text Data Using a Probability Based Feature Weighting Scheme [C]. Abdul Wahid, Xiaoying Gao, Peter Andreae. International Conference on Web Information Systems Engineering. 2015.
5. High-dimensional data mining: Subspace clustering, outlier detection and applications to classification [D]. Foss, Andrew Philip Ogilvie. 2010.
6. Comparison of Methods for Feature Selection in Clustering of High-Dimensional RNA-Sequencing Data to Identify Cancer Subtypes [O]. David Källberg, Linda Vidman, Patrik Rydén. 2021.
7. Study of subspace clustering algorithm of high dimensional data based on variable weighting methods [O]. 邓莹, 杨双远, 刘菡. 2009.
