Document clustering around weighted-medoids

Abstract

In this paper, we propose a new similarity-based k-partitions clustering approach called CAWP. Given the pairwise similarities of the objects in a dataset, CAWP groups the objects into K non-overlapping clusters. Each cluster is represented by multiple objects, each carrying its own weight, called a prototype weight. The more representative an object is of a cluster, the larger the prototype weight assigned to that object in that cluster. Compared with the traditional k-medoids approach, where each cluster is represented by a single medoid (representative object), using prototype weights so that multiple objects jointly describe a cluster is, in our view, more appropriate. Experimental studies on large document datasets show that CAWP compares favorably with other existing similarity-based clustering approaches, achieving both good effectiveness and good efficiency.
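
The idea of describing each cluster by several weighted objects rather than a single medoid can be illustrated with a minimal sketch. The abstract does not give CAWP's update equations, so everything below is an assumption made for illustration: the function name `weighted_prototype_clustering`, the assignment rule (join the cluster with the highest weighted similarity), and the weight update (weight proportional to an object's total similarity to its cluster's members). It is not the authors' algorithm.

```python
import random


def weighted_prototype_clustering(sim, k, n_iter=20, seed=0):
    """Toy clustering around weighted prototypes on a similarity matrix.

    NOTE: the assignment and weight-update rules are illustrative
    assumptions; they are NOT taken from the CAWP paper.
    """
    n = len(sim)
    rng = random.Random(seed)
    # Start from k distinct seed objects; each seed carries all the
    # prototype weight of its cluster (the plain k-medoids situation).
    seeds = rng.sample(range(n), k)
    weights = [[1.0 if j == s else 0.0 for j in range(n)] for s in seeds]
    labels = [-1] * n
    for _ in range(n_iter):
        # Assignment: each object joins the cluster whose weighted
        # prototypes it is most similar to (ties go to the lower index).
        new_labels = []
        for i in range(n):
            scores = [sum(w[j] * sim[i][j] for j in range(n)) for w in weights]
            new_labels.append(max(range(k), key=lambda c: scores[c]))
        if new_labels == labels:
            break  # partition is stable
        labels = new_labels
        # Weight update (assumed rule): a member's prototype weight is
        # proportional to its total similarity to the cluster's members,
        # itself included, normalised to sum to 1 within the cluster.
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                continue  # keep the old weights for an empty cluster
            rep = {j: sum(sim[i][j] for i in members) for j in members}
            total = sum(rep.values())
            if total <= 0:
                # Degenerate case: fall back to uniform weights.
                rep = {j: 1.0 for j in members}
                total = float(len(members))
            weights[c] = [rep.get(j, 0.0) / total for j in range(n)]
    return labels, weights
```

Calling `weighted_prototype_clustering(sim, k)` with an n-by-n list of pairwise similarities returns one cluster label per object and, for each cluster, a list of n prototype weights. Non-members get weight 0, and a cluster with several equally central members spreads its weight across them instead of picking one medoid.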

Bibliographic details

Source: Information, Communications and Signal Processing (ICICS) 2011 8th International Conference on, 2011, pp. 1-5 (5 pages)
Conference location: Singapore (SG)
Authors: Jian-Ping Mei; Lihui Chen
Original format: PDF
Language: English
CLC classification: Communications
