Systems, Man and Cybernetics (SMC), 2008 IEEE International Conference on

Clustering based on Generalized Inverse Transformation

Abstract

This paper presents a novel approach that combines Dimension Extension and Generalized Inverse Transformation (DEGIT) to perform data clustering. Unlike the k-means algorithm, DEGIT does not require the number of clusters k to be specified in advance; centroid locations are updated and redundant centroids are eliminated automatically during the iterative training process. In essence, DEGIT clusters by applying a generalized inverse transformation to the input data, so that each data point is represented as a linear combination of bases with extended dimension. Each basis corresponds to a centroid, and its coefficient represents the closeness between the data point and that basis. The paper also addresses clustering validation. First, Principal Component Analysis is applied to detect whether a dominant dimension exists. If so, the original input data are rotated by a certain angle with respect to a defined center of mass, and the rotated data undergo another run of the iterative training process. After several runs of rotation and iterative training, the labels from the different runs are compared: a data point assigned to one centroid more often than to any other is labeled with the class indexed by that winning centroid.
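
The abstract does not give the details of DEGIT, so the following Python sketch only illustrates the building blocks it names, under loudly stated assumptions. The dimension extension used here (appending a constant 1 to each vector), the variance threshold of 0.9, the restriction to 2-D rotation, and every function name are hypothetical choices made for illustration and are not taken from the paper. The centroid update and the elimination of redundant centroids are not shown, and the voting step assumes that centroid indices are already aligned across runs.

```python
import numpy as np

def extend(X):
    """Hypothetical dimension extension: append a constant 1 to every row."""
    X = np.asarray(X, dtype=float)
    return np.hstack([X, np.ones((X.shape[0], 1))])

def coefficients(X, centroids):
    """Express each extended point as a linear combination of extended centroids.

    Solves X_ext ~= A @ C_ext in the least-squares sense using the
    Moore-Penrose pseudo-inverse; row i of A holds the coefficients of point i.
    """
    return extend(X) @ np.linalg.pinv(extend(centroids))

def assign(X, centroids):
    """Label each point by the basis (centroid) with the largest coefficient."""
    return np.argmax(coefficients(X, centroids), axis=1)

def has_dominant_dimension(X, threshold=0.9):
    """PCA check: does the first principal component carry more than
    `threshold` of the total variance? (the threshold is an assumed value)"""
    Xc = np.asarray(X, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    var = np.linalg.svd(Xc, compute_uv=False) ** 2
    return var[0] / var.sum() > threshold

def rotate_about_center(X, angle):
    """Rotate 2-D points by `angle` radians about their mean (center of mass)."""
    X = np.asarray(X, dtype=float)
    center = X.mean(axis=0)
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, -s], [s, c]])
    return (X - center) @ R.T + center

def majority_vote(label_runs):
    """Most frequent label per point across runs (labels assumed aligned)."""
    runs = np.asarray(label_runs)
    n_labels = runs.max() + 1
    counts = np.stack([(runs == k).sum(axis=0) for k in range(n_labels)])
    return counts.argmax(axis=0)

# Toy data: three well-separated groups and three fixed, hand-picked centroids.
X = np.array([[0.1, 0.0], [0.0, 0.2], [5.0, 5.1],
              [5.2, 4.9], [0.0, 5.0], [0.2, 5.1]])
centroids = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
labels = assign(X, centroids)
print(labels)
```

With the constant-1 extension and three centroids in the plane, the coefficients are the barycentric coordinates of each point with respect to the centroids, which is one simple way to make "coefficient as closeness" concrete. A different extension scheme would give different coefficients.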