Journal: Knowledge-Based Systems

GBK-means clustering algorithm: An improvement to the K-means algorithm based on the bargaining game

Abstract

Thanks to its simplicity, versatility and wide range of applications, k-means is one of the well-known algorithms for clustering data. The algorithm is founded on a distance measure. However, traditional k-means shows weaknesses on some real-world data sets, the most important being that it considers only the distance criterion when clustering. Various studies have addressed these weaknesses in an attempt to balance quality and efficiency. In this paper, a novel variant of k-means is proposed. The approach applies bargaining game modelling to the k-means algorithm: cluster centres compete with each other to attract the largest number of similar objects or entities to their cluster. The centres therefore keep changing their positions so that they lie closer to as many data points as possible than the other cluster centres do. We name this new algorithm the game-based k-means (GBK-means) algorithm. To show the superiority and efficiency of GBK-means over conventional clustering algorithms, namely k-means and fuzzy k-means, we use both synthetic and real-world data sets: (1) a series of two-dimensional synthetic data sets; and (2) ten benchmark data sets that are widely used in clustering studies. The results show that GBK-means clusters data more accurately than the classical algorithms according to eight evaluation metrics, namely F-measure, the Dunn index (DI), the Rand index (RI), the Jaccard index (JI), normalized mutual information (NMI), normalized variation of information (NVI), the measure of concordance and the error rate (ER). © 2020 Elsevier B.V. All rights reserved.
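
For orientation, the distance-only baseline that the abstract contrasts with GBK-means, together with one of the listed metrics (the Rand index), might be sketched as follows. This is plain Lloyd-style k-means, not GBK-means: the abstract does not specify the bargaining-game update, so it is not attempted here. The function names `kmeans` and `rand_index`, the fixed initial centres and the toy data are assumptions made for illustration only.

```python
from itertools import combinations
from math import dist


def kmeans(points, centres, max_iter=100):
    """Plain Lloyd-style k-means (the baseline, not GBK-means)."""
    centres = [tuple(c) for c in centres]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre,
        # using Euclidean distance as the only criterion.
        labels = [min(range(len(centres)), key=lambda j: dist(p, centres[j]))
                  for p in points]
        # Update step: move each centre to the mean of its members.
        new_centres = []
        for j, old in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                new_centres.append(old)  # empty cluster: leave its centre in place
        if new_centres == centres:  # no centre moved, so the labels are stable
            break
        centres = new_centres
    return centres, labels


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree
    (both put the pair together, or both put it apart)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


# Hypothetical two-dimensional toy data with two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
truth = [0, 0, 0, 1, 1, 1]
centres, labels = kmeans(points, centres=[(0.0, 0.0), (10.0, 10.0)])
print(labels, rand_index(labels, truth))
```

The initial centres are passed in explicitly so that the run is deterministic; a practical implementation would normally choose them at random or with a seeding heuristic.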