Vertical Set Square Distance Based Clustering without Prior Knowledge of K

Abstract

Clustering is the automated identification of groups of objects based on similarity. Two major research issues in clustering are scalability and the need for domain knowledge to determine input parameters. Most approaches address scalability through sampling. However, sampling does not guarantee the best solution and can cause a significant loss of accuracy. Most approaches also rely on domain knowledge, trial and error, or exhaustive search to determine the required input parameters. In this paper we introduce a new clustering technique based on the set square distance. Cluster membership is determined by a point's set square distance to the respective cluster. Where k-means represents a cluster by its mean and k-medoids by its medoid, here the cluster is represented by its entire set of points in each evaluation of membership. The set square distance for all n items can be computed efficiently in O(n) using a vertical data structure and a few pre-computed values. A special ordering of the set square distances is used to break the data into its "natural" clusters, which removes the need for a known k as in k-means or k-medoids partition clustering. The new technique gives superior results when compared with classical k-means clustering. Cluster quality and the resolution of the unknown k are evaluated on data sets with known classes: the iris data, the uci_kdd network intrusion data, and synthetic data. The scalability of the proposed technique is demonstrated on a large RSI data set.
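
The O(n) claim can be illustrated with a standard algebraic identity: for a point a and a set X of n points, the sum of squared Euclidean distances from a to every x in X equals n·|a|² − 2·a·Σx + Σ|x|². The sketch below is a minimal illustration of that identity only. It assumes plain Python lists in place of the paper's vertical data structure (P-trees), and the function names `precompute`, `set_square_distance`, and `brute_force` are hypothetical, not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
Stats = Tuple[int, List[float], float]


def precompute(points: Sequence[Point]) -> Stats:
    """One pass over X: point count, per-dimension sums, and sum of squared norms."""
    n = len(points)
    dims = len(points[0])
    coord_sums = [0.0] * dims
    sq_norm_sum = 0.0
    for p in points:
        for j, v in enumerate(p):
            coord_sums[j] += v
            sq_norm_sum += v * v
    return n, coord_sums, sq_norm_sum


def set_square_distance(a: Point, stats: Stats) -> float:
    """Sum of squared distances from a to all points of X, using only the aggregates.

    Uses n*|a|^2 - 2*(a . sum of x) + sum of |x|^2, so the cost per point
    depends on the number of dimensions, not on the size of X.
    """
    n, coord_sums, sq_norm_sum = stats
    a_sq = sum(v * v for v in a)
    dot = sum(v * s for v, s in zip(a, coord_sums))
    return n * a_sq - 2.0 * dot + sq_norm_sum


def brute_force(a: Point, points: Sequence[Point]) -> float:
    """Reference computation that visits every pair directly (O(n) per point)."""
    return sum(sum((av - pv) ** 2 for av, pv in zip(a, p)) for p in points)


# Small illustrative data set (made up for this sketch): three close points and one far away.
points = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (8.0, 9.0)]
stats = precompute(points)
distances = [set_square_distance(p, stats) for p in points]
```

With the aggregates computed once, evaluating all n points takes a single further pass, which is consistent with the O(n) total stated in the abstract. How the paper orders these values to separate clusters is not described in the abstract, so the sketch stops at the distance computation.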

Bibliographic record

Source
International Conference on Intelligent and Adaptive Systems and Software Engineering, 2005, 6 pages
Authors
Amal Perera; Taufik Abidin; Masum Serazi; George Hamer; William Perrizo
Original format: PDF
Chinese Library Classification: TP18-53
Keywords
Vertical Set Square Distance; P-trees; Clustering

