k-ANMI: A mutual information based clustering algorithm for categorical data

Zengyou He; Xiaofei Xu; Shengchun Deng


Abstract

Clustering categorical data is an integral part of data mining and has attracted much attention recently. In this paper, we present k-ANMI, a new efficient algorithm for clustering categorical data. The k-ANMI algorithm works in a way similar to the popular k-means algorithm, and the goodness of the clustering at each step is evaluated using a mutual-information-based criterion, average normalized mutual information (ANMI), borrowed from cluster ensembles. The algorithm is easy to implement, requiring multiple hash tables as its only major data structure. Experimental results on real datasets show that k-ANMI is competitive with state-of-the-art categorical data clustering algorithms with respect to clustering accuracy.
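The abstract does not spell out the procedure, so the following is only a minimal sketch of how an ANMI-driven, k-means-like loop might look. It assumes that each categorical attribute is treated as one partition of the objects, that NMI is normalized by the geometric mean of the two entropies (the usual cluster-ensemble definition), and that objects are greedily reassigned to whichever cluster raises ANMI. The function names (`nmi`, `anmi`, `k_anmi_sketch`) and the parameters `init_labels` and `max_iter` are hypothetical, and the score is recomputed from scratch with `Counter` hash tables rather than updated incrementally.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (natural log) of a distribution given as raw counts."""
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def nmi(labels_a, labels_b):
    """Normalized mutual information: I(A;B) / sqrt(H(A) * H(B))."""
    n = len(labels_a)
    count_a = Counter(labels_a)                # hash table: value -> frequency
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))   # hash table: (a, b) -> frequency
    h_a = entropy(count_a.values(), n)
    h_b = entropy(count_b.values(), n)
    if h_a == 0.0 or h_b == 0.0:
        return 0.0                             # a constant labeling carries no information
    mi = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    return mi / math.sqrt(h_a * h_b)


def anmi(labels, rows):
    """Average NMI between a clustering and every attribute column (assumption:
    each attribute is viewed as one partition of the objects)."""
    columns = list(zip(*rows))
    return sum(nmi(labels, col) for col in columns) / len(columns)


def k_anmi_sketch(rows, k, init_labels, max_iter=20):
    """Greedy reassignment: move each object to the cluster that most improves ANMI.
    Stops when a full pass moves nothing or after max_iter passes."""
    labels = list(init_labels)
    for _ in range(max_iter):
        moved = False
        for i in range(len(rows)):
            original = labels[i]
            best_label, best_score = original, anmi(labels, rows)
            for candidate in range(k):
                if candidate == original:
                    continue
                labels[i] = candidate          # tentatively move object i
                score = anmi(labels, rows)
                if score > best_score + 1e-12:
                    best_label, best_score = candidate, score
            labels[i] = best_label             # keep the best move (or stay put)
            moved = moved or best_label != original
        if not moved:
            break
    return labels
```

Because a move is accepted only when it strictly increases the score, the ANMI of the returned labeling is never lower than that of `init_labels`. Recomputing the score for every candidate move is costly; an implementation aiming for efficiency would presumably update the hash-table counts incrementally instead.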


Bibliographic information

Source: Information Fusion, 2008, Issue 2, 11 pages
Authors: Zengyou He; Xiaofei Xu; Shengchun Deng
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: Clustering; Categorical data; Mutual information; Cluster ensemble; Data mining


Similar literature

1. k-ANMI: A mutual information based clustering algorithm for categorical data [J]. Zengyou He, Xiaofei Xu, Shengchun Deng. Information Fusion, 2008, Issue 2.
2. Efficient algorithms based on the k-means and Chaotic League Championship Algorithm for numeric, categorical, and mixed-type data clustering [J]. Wangchamhan Tanachapong, Chiewchanwattana Sirapat, Sunat Khamron. Expert Systems with Applications, 2017, Issue deca30.
3. GACC: genetic algorithm-based categorical data clustering for large datasets [J]. Abha Sharma, R.S. Thakur. International Journal of Data Mining, Modelling and Management, 2017, Issue 4.
4. K-modes Based Categorical Data Clustering Algorithms Satisfying Differential Privacy [C]. Mingshuang Li, Yihui Zhou, Wenru Tang. International Conference on Networking and Network Applications, 2020.
5. Clustering algorithms for categorical data [D]. William Andreopoulos. 2006.
6. A Novel Artificial Bee Colony Based Clustering Algorithm for Categorical Data [O]. Jinchao Ji, Wei Pang, Yanlin Zheng.
7. K-ANMI: A Mutual Information Based Clustering Algorithm for Categorical Data [O]. Zengyou He, Xiaofei Xu, Shengchun Deng. 2005.
