Clustering by similarity in an auxiliary space



Abstract

We present a clustering method for continuous data. It defines local clusters in the (primary) data space but derives its similarity measure from the posterior distributions of additional discrete data that occur as pairs with the primary data. As a case study, enterprises are clustered by deriving the similarity measure from bankruptcy sensitivity. In another case study, a content-based clustering for text documents is found by measuring differences between their metadata (keyword distributions). We show that minimizing our Kullback-Leibler divergence-based distortion measure within the categories is equivalent to maximizing the mutual information between the categories and the distributions in the auxiliary space. A simple on-line algorithm for minimizing the distortion is introduced for Gaussian basis functions and their analogs on a hypersphere.
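The equivalence stated in the abstract can be checked numerically on a toy problem. The sketch below is not the authors' on-line algorithm, which works on continuous data with soft Gaussian basis functions. It assumes a finite set of primary samples x, a hard partition into clusters v, and cluster prototypes equal to the cluster-conditional distributions p(c|v). Under those assumptions the average Kullback-Leibler distortion equals I(X;C) - I(V;C), so minimizing the distortion over partitions is the same as maximizing I(V;C). All function and variable names are hypothetical.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information I(A;B) in nats for a joint table joint[a, b]."""
    pa = joint.sum(axis=1, keepdims=True)   # marginal over rows, shape (n, 1)
    pb = joint.sum(axis=0, keepdims=True)   # marginal over columns, shape (1, m)
    mask = joint > 0                        # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (pa @ pb)[mask])))

def kl_distortion(joint_xc, assign, n_clusters):
    """Average KL divergence from p(c|x) to the prototype of x's cluster.

    joint_xc[i, c] is the joint probability of sample i and auxiliary class c.
    assign[i] is the (hard) cluster index of sample i; this is an assumption
    of the toy setting, the paper itself uses soft memberships.
    """
    px = joint_xc.sum(axis=1)
    cond_x = joint_xc / px[:, None]                     # p(c|x)
    joint_vc = np.zeros((n_clusters, joint_xc.shape[1]))
    np.add.at(joint_vc, assign, joint_xc)               # p(v, c)
    proto = joint_vc / joint_vc.sum(axis=1, keepdims=True)  # p(c|v)
    kl = np.sum(cond_x * np.log(cond_x / proto[assign]), axis=1)
    return float(px @ kl), joint_vc

rng = np.random.default_rng(0)
joint_xc = rng.random((6, 3)) + 0.05    # strictly positive toy joint table
joint_xc /= joint_xc.sum()
assign = np.array([0, 0, 1, 1, 2, 2])   # arbitrary hard partition, 3 clusters

distortion, joint_vc = kl_distortion(joint_xc, assign, 3)
gap = mutual_information(joint_xc) - mutual_information(joint_vc)
print(f"distortion = {distortion:.6f}, I(X;C) - I(V;C) = {gap:.6f}")
```

Because I(X;C) does not depend on the partition, any change of `assign` that lowers `distortion` raises I(V;C) by the same amount.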


Bibliographic details

Source: Intelligent Data Engineering and Automated Learning | 2000 | 6 pages
Authors: Janne Sinkkonen; Samuel Kaski
Original format: PDF
Chinese Library Classification: TP3-532

Similar literature


1. Subspace Clustering for High-Dimensional Data Using Cluster Structure Similarity [J]. Kavan Fatehi, Mohsen Rezvani, Mansoor Fateh. International Journal of Intelligent Information Technologies. 2018, No. 3
2. Structural similarity and descriptor spaces for clustering and development of QSAR models [J]. Irene Luque Ruiz, Gonzalo Cerruela García, Miguel Angel Gómez-Nieto. Current Computer-Aided Drug Design. 2013, No. 2
3. CSVD: clustering and singular value decomposition for approximate similarity search in high-dimensional spaces [J]. Castelli V., Thomasian A., Chung-Sheng Li. IEEE Transactions on Knowledge and Data Engineering. 2003, No. 3
4. Clustering by Similarity in an Auxiliary Space [C]. Janne Sinkkonen, Samuel Kaski. Second International Conference on Intelligent Data Engineering and Automated Learning - IDEAL 2000: Data Mining, Financial Engineering, and Intelligent Agents, Dec 13-15, 2000, Hong Kong, China. 2000
5. Ontology-based similarity for clustering in text space [D]. Assem, Nasser. 2002
6. Convex hulls in hamming space enable efficient search for similarity and clustering of genomic sequences [O]. David S. Campo, Yury Khudyakov. 2020
7. Clustering by Similarity in an Auxiliary Space [O]. Janne Sinkkonen, Samuel Kaski. 2000
