Journal: International Journal of Pattern Recognition and Artificial Intelligence

Automatic Scale Parameters in Affinity Matrix Construction for Improved Spectral Clustering

Abstract

Spectral clustering partitions data into similar groups in the eigenspace of the affinity matrix. The accuracy of the spectral clustering algorithm is affected by the affine equivariance realized in the translation of distance to similarity relationship. The similarity value, computed as a Gaussian of the distance between data objects, is sensitive to the scale factor sigma. The value of sigma, which controls how quickly the affinity value drops, is generally a fixed constant or determined by manual tuning. In this research work, sigma is determined automatically from the distance values, i.e., from the similarity relationship that exists in the real data space. The affinity value of a data pair is determined as a location estimate of the spread of distance values of the data points with the other points. The scale factor sigma(i) corresponding to a data point x(i) is computed as the trimean of its distance vector and is used to fix the scale when computing the affinity matrix. The proposed automatic scale parameter for spectral clustering resulted in a robust similarity matrix that is affine equivariant with the distance distribution, and it eliminates the overhead of manual tuning to find the best sigma value. The performance of spectral clustering using such affinity matrices was analyzed on UCI data sets and image databases. The scores obtained for NMI, ARI, Purity and F-score were equivalent to those of existing works, and better for most of the data sets. The proposed scale factor was used in various state-of-the-art spectral clustering algorithms and performed well irrespective of the normalization operations applied in those algorithms. A comparison of clustering error rates for various data sets across the algorithms shows that the proposed automatic scale factor clusters the data sets as well as a manually tuned best sigma value does. Thus the automatic scale factor proposed in this research work eliminates the need for an exhaustive grid search for the scale parameter that gives the best clustering performance.
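
The abstract states that sigma(i) is the trimean of the distance vector of x(i), but it does not give the exact affinity formula. The sketch below is therefore only an illustration under stated assumptions: Euclidean distances, Tukey's trimean (Q1 + 2*median + Q3) / 4, self-distances excluded from each distance vector, and a locally scaled Gaussian kernel exp(-d(i,j)^2 / (sigma(i) * sigma(j))). The function names `trimean` and `trimean_affinity` are hypothetical and are not taken from the paper.

```python
import numpy as np


def trimean(values):
    """Tukey's trimean: (Q1 + 2 * median + Q3) / 4."""
    q1, q2, q3 = np.percentile(values, [25, 50, 75])
    return (q1 + 2.0 * q2 + q3) / 4.0


def trimean_affinity(X, eps=1e-12):
    """Build an affinity matrix with one scale factor per data point.

    Assumptions (not confirmed by the abstract): Euclidean distance,
    self-distance excluded from each distance vector, and the kernel
    A[i, j] = exp(-d_ij**2 / (sigma_i * sigma_j)).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # sigma_i is the trimean of the distances from point i to all other points.
    others = ~np.eye(n, dtype=bool)
    sigma = np.array([trimean(D[i, others[i]]) for i in range(n)])
    sigma = np.maximum(sigma, eps)  # guard against division by zero

    A = np.exp(-(D ** 2) / np.outer(sigma, sigma))
    np.fill_diagonal(A, 0.0)  # no self-affinity
    return A, sigma


# Two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
A, sigma = trimean_affinity(X)
```

The resulting matrix can be passed to any spectral clustering routine that accepts a precomputed affinity matrix. Because each sigma(i) comes from the data, no grid search over a global sigma is involved.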