Journal: Concurrency and Computation: Practice and Experience

A clustering algorithm based on the weighted entropy of conditional attributes for mixed data

Abstract

A novel definition for weighted entropy is proposed to improve clustering performance for small and diverse datasets. First, intra-class and inter-class weighted entropies for categorical and numeric conditional attributes are respectively developed using the mathematical definition of entropy. Second, the weighted entropy is used to calculate cluster weights for mixed conditional attributes. A unique weighted clustering algorithm that adopts entropy as its primary description term, after integrating the corresponding distance calculation mechanism, is then introduced. Finally, a theoretical analysis and validation experiment were conducted using the UC-Irvine dataset. Results showed that the proposed algorithm offers high self-adaptability, as its clustering performance was superior to the existing K-prototypes, SBAC, and OCIL algorithms.
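The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea of entropy-driven attribute weighting for mixed data, not the authors' definition. It swaps in plain information gain (total entropy minus size-weighted within-cluster entropy) as the weight, discretizes numeric attributes into equal-width bins before computing entropy, and uses a k-prototypes-style distance (squared difference for numeric values, 0/1 mismatch for categorical values). All function names, the bin count `k`, and the assumption that numeric values are already scaled to a comparable range are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def equal_width_bins(values, k=4):
    """Discretize numeric values into k equal-width bins (an assumed choice)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    return [min(int((v - lo) / (hi - lo) * k), k - 1) for v in values]


def within_cluster_entropy(values, labels):
    """Size-weighted average entropy of one attribute inside each cluster."""
    n = len(values)
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    return sum(len(g) / n * shannon_entropy(g) for g in groups.values())


def attribute_weights(columns, labels):
    """Illustrative weights: normalized information gain per attribute.

    `columns` holds one list of discrete values per attribute. This is an
    assumed stand-in for the paper's weighted entropy, not its formula.
    """
    gains = [
        max(shannon_entropy(col) - within_cluster_entropy(col, labels), 0.0)
        for col in columns
    ]
    total = sum(gains)
    if total == 0:
        return [1.0 / len(columns)] * len(columns)  # fall back to uniform
    return [g / total for g in gains]


def weighted_distance(x, y, weights, is_numeric):
    """Weighted mixed distance: squared difference or 0/1 mismatch."""
    d = 0.0
    for a, b, w, numeric in zip(x, y, weights, is_numeric):
        d += w * ((a - b) ** 2 if numeric else float(a != b))
    return d
```

In an iterative clustering loop, weights like these would typically be recomputed from the current cluster labels after each assignment step, so attributes that separate the clusters well gain influence on the distance.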
