Learning Discriminative Representations for Big Data Clustering Using Similarity-Based Dimensionality Reduction


Abstract

Discriminative clustering techniques simultaneously perform clustering and learn a representation that encourages the separability of the clusters. However, methods with high discriminative power tend to decrease clustering accuracy, since the cluster assignments are usually noisy. This paper proposes a similarity-based dimensionality reduction method that learns regularized clustering-oriented representations and scales efficiently to large datasets. We avoid the pitfalls of highly discriminative methods, such as Linear Discriminant Analysis (LDA), by maintaining a small similarity between the inter-cluster samples and a small dissimilarity between the intra-cluster samples, instead of collapsing the intra-cluster samples and pushing the clusters as far apart as possible. Three datasets are used to demonstrate that the proposed method learns robust representations that improve the quality of the obtained clustering solutions over other clustering techniques.
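
The idea of targeting a small inter-cluster similarity and a small intra-cluster dissimilarity, rather than fully collapsing and separating the clusters, can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the Gaussian similarity kernel, the target values `intra=0.8` and `inter=0.2`, the squared-error objective, the linear projection `W`, and all function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def target_similarity(labels, intra=0.8, inter=0.2):
    """Target matrix: `intra` for same-cluster pairs, `inter` otherwise.

    Values strictly inside (0, 1) are an assumed way to express "small
    dissimilarity within clusters, small similarity between clusters".
    """
    labels = np.asarray(labels)
    same_cluster = labels[:, None] == labels[None, :]
    return np.where(same_cluster, intra, inter)


def gaussian_similarity(Z, sigma=1.0):
    """Pairwise Gaussian-kernel similarity in [0, 1] between the rows of Z."""
    sq_norms = np.sum(Z ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (Z @ Z.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def similarity_loss(X, W, labels, intra=0.8, inter=0.2, sigma=1.0):
    """Mean squared gap between projected similarities and their targets."""
    Z = X @ W  # low-dimensional representation (assumed linear projection)
    S = gaussian_similarity(Z, sigma)
    T = target_similarity(labels, intra, inter)
    return float(np.mean((S - T) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 4))          # six samples, four features
    W = rng.normal(size=(4, 2))          # projection to two dimensions
    labels = [0, 0, 0, 1, 1, 1]          # noisy assignments, e.g. from k-means
    print(similarity_loss(X, W, labels))
```

Under these assumptions, a representation that collapses each cluster to a point and pushes the clusters far apart is penalized, because its similarities (1 within clusters, 0 between them) overshoot the softer targets. In practice `W` would be fitted by minimizing this loss with a gradient-based optimizer, which the sketch omits.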


Bibliographic details

Source
IEEE Image, Video, and Multidimensional Signal Processing Workshop | 2018 | 169p | 5 pages
Authors
Nikolaos Passalis; Anastasios Tefas
Original format: PDF
Chinese Library Classification: TN911.7-53
Keywords
Task analysis; Optimization; Clustering algorithms; Dimensionality reduction; Robustness; Clustering methods; Linear programming


