Journal: WSEAS Transactions on Communications

Inter Cluster Distance Management Model with Optimal Centroid Estimation for K-Means Clustering Algorithm



Abstract

Clustering techniques group transactions according to their relevance. Cluster analysis is one of the primary data analysis methods. Clustering can be carried out in two ways: hierarchical clustering and partition clustering. Hierarchical clustering uses the structure and the data values, whereas partition clustering uses data similarity factors to partition transactions into small groups. K-means is one of the most widely used clustering algorithms. Its local cluster accuracy is high, but it does not take inter-cluster relationships into account. K-means requires the cluster count as its major input and chooses random transactions as the initial centroid of each cluster. Cluster accuracy depends on this initial centroid estimation: random transaction-based selection may choose similar transactions as centroids, in which case cluster accuracy is limited by the distance between the centroid values. The proposed system is designed to improve the K-means clustering algorithm with efficient centroid estimation models. Three centroid estimation models are proposed: random selection with distance management, the mean distance model, and the inter-cluster distance model. Cosine and Euclidean distance measures are used to estimate the similarity between transactions, and the three centroid estimation models are tested with both measures. Precision, recall, and fitness measures are used to test cluster accuracy. The system is developed in Java with an Oracle database.
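The abstract names the two distance measures and the idea of random centroid selection with distance management, but it does not give the selection rule. The following is a minimal sketch under an assumed rule: a randomly drawn transaction is accepted as a centroid only if it lies at least `min_distance` from every centroid already chosen. The function `pick_spread_centroids` and the parameter `min_distance` are hypothetical names introduced for illustration, and the sketch is written in Python for brevity, whereas the authors report using Java.

```python
import math
import random


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; a zero vector is treated as maximally distant."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (norm_a * norm_b)


def pick_spread_centroids(points, k, min_distance,
                          distance=euclidean_distance, seed=None):
    """Hypothetical helper: random seeding that rejects near-duplicate centroids.

    Assumption (not taken from the paper): a candidate is accepted only if it
    is at least `min_distance` away from every centroid chosen so far.
    """
    rng = random.Random(seed)
    candidates = list(points)
    rng.shuffle(candidates)  # visit the transactions in random order
    centroids = []
    for candidate in candidates:
        if all(distance(candidate, c) >= min_distance for c in centroids):
            centroids.append(candidate)
            if len(centroids) == k:
                return centroids
    raise ValueError("fewer than k sufficiently separated points; "
                     "lower min_distance or k")


# Usage: five toy transactions forming three well-separated groups.
points = [(1.0, 1.0), (1.1, 0.9), (8.0, 8.0), (8.2, 7.9), (1.0, 9.0)]
centroids = pick_spread_centroids(points, k=3, min_distance=3.0, seed=0)
print(centroids)
```

Passing `distance=cosine_distance` swaps the measure without changing the selection logic, which mirrors the abstract's setup of testing each centroid estimation model with both measures. The threshold must then be expressed on the cosine scale, where values range from 0 to 2.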
