Journal: 计算机应用研究 (Application Research of Computers)

Scaled η-Maximum Entropy Clustering Algorithm Based on Maximum Center Interval

     

Abstract

Proportional scaling is a common way to adjust the differences between data points, and it does not destroy any of the information the data carry. Most clustering algorithms, however, are sensitive to proportionally scaled data; the maximum entropy clustering (MEC) algorithm is a typical example. Extensive experiments show that MEC fails once the scaling factor falls below the order of 10^-3: the cluster centers it produces tend to coincide. To address this problem, this paper adds a maximum center interval term and a scaling factor η to the MEC algorithm and constructs a new objective function; the resulting method is called the η-type maximum center interval maximum entropy clustering (η-MCS-MEC) algorithm. The algorithm maximizes the distances between cluster centers and uses the scaling factor η to regulate the partition of each cluster, which prevents the cluster centers from coinciding. Experiments on artificial data sets and UCI standard data sets show that the proposed algorithm is no longer sensitive to scaled data and is robust.
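The center collapse described in the abstract can be reproduced with a minimal sketch of the standard MEC updates (memberships u_ij proportional to exp(-||x_j - v_i||^2 / γ), centers as membership-weighted means). This is not the paper's η-MCS-MEC method, whose objective function is not given here. The entropy weight `gamma = 1.0`, the two synthetic Gaussian blobs, the deterministic initialization, and the helper names `mec` and `center_gap` are all assumptions made for illustration.

```python
import numpy as np

def mec(X, n_clusters, gamma=1.0, n_iter=100):
    """Standard maximum entropy clustering with a fixed entropy weight gamma (assumed)."""
    # Deterministic initialization (an assumption): evenly spaced samples as centers.
    idx = np.linspace(0, len(X) - 1, n_clusters).astype(int)
    centers = X[idx].copy()
    for _ in range(n_iter):
        # Squared distances between every sample and every center, shape (n, c).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Softmax memberships; subtracting the row maximum keeps exp() stable.
        logits = -d2 / gamma
        logits -= logits.max(axis=1, keepdims=True)
        U = np.exp(logits)
        U /= U.sum(axis=1, keepdims=True)
        # Each center becomes the membership-weighted mean of the samples.
        centers = (U.T @ X) / U.sum(axis=0)[:, None]
    return centers, U

def center_gap(X, scale, gamma=1.0):
    """Distance between the two MEC centers, divided by the scale for comparability."""
    centers, _ = mec(X * scale, n_clusters=2, gamma=gamma)
    return np.linalg.norm(centers[0] - centers[1]) / scale

# Two well-separated synthetic blobs (illustrative data, not from the paper).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, (100, 2)),
               rng.normal(5.0, 0.5, (100, 2))])

gap_raw = center_gap(X, 1.0)      # unscaled data
gap_scaled = center_gap(X, 1e-3)  # same data scaled by 10^-3
print(gap_raw, gap_scaled)
```

With γ held fixed, scaling the data by 10^-3 makes every squared distance tiny relative to γ, so all memberships approach 1/2 and both centers drift to the global mean. The rescaled gap therefore stays near the true blob separation on the raw data and shrinks to nearly zero on the scaled data.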
