
Document clustering around weighted-medoids



Abstract

In this paper, we propose a new similarity-based k-partitions clustering approach, called CAWP. Given the pairwise similarities of the objects in a dataset, CAWP groups these objects into K non-overlapping clusters. Each cluster is represented by multiple objects with different weights, called prototype weights. The more representative an object is of a cluster, the larger the prototype weight assigned to that object in that cluster. Compared with the traditional k-medoids approach, where each cluster is represented by a single medoid (representative object), using prototype weights to let multiple objects jointly describe a cluster is, in our view, more appropriate. Experimental studies on large document datasets show that CAWP compares favorably with other existing similarity-based clustering approaches, achieving both good effectiveness and efficiency.
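The idea of representing each cluster by several weighted objects, rather than by one medoid, can be illustrated with a minimal sketch. This is not the CAWP algorithm itself: the abstract does not give the weight formula, the initialization, or the stopping rule, so everything below is an assumption made for illustration. Here the hypothetical function `weighted_medoid_clustering` takes the prototype weight of an object to be its total similarity to the members of its cluster, normalized to sum to 1; seeds the clusters with the first K objects; and assigns each object to the cluster with the highest weighted similarity.

```python
def weighted_medoid_clustering(sim, k, max_iter=20):
    """Illustrative sketch only; not the CAWP algorithm from the paper.

    sim: n x n list of lists of pairwise similarities (larger = more alike).
    k:   number of non-overlapping clusters.
    Returns (labels, weights), where weights[c] maps object index to its
    prototype weight in cluster c.
    """
    n = len(sim)
    # Assumed initialization: the first k objects each act as a lone prototype.
    weights = [{seed: 1.0} for seed in range(k)]
    labels = [None] * n

    for _ in range(max_iter):
        # Assignment step: each object joins the cluster whose weighted
        # prototypes are, in total, most similar to it.
        new_labels = [
            max(
                range(k),
                key=lambda c: sum(w * sim[i][x] for i, w in weights[c].items()),
            )
            for x in range(n)
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels

        # Update step (assumed rule): an object's prototype weight is its
        # summed similarity to its cluster's members, normalized to sum to 1.
        for c in range(k):
            members = [x for x in range(n) if labels[x] == c]
            if not members:
                continue  # keep the previous prototypes for an empty cluster
            score = {i: sum(sim[i][j] for j in members) for i in members}
            total = sum(score.values())
            weights[c] = {
                i: (s / total if total > 0 else 1.0 / len(members))
                for i, s in score.items()
            }
    return labels, weights


if __name__ == "__main__":
    # Toy similarity matrix: objects 0 and 2 are alike, as are 1 and 3.
    sim = [
        [1.0, 0.1, 0.9, 0.2],
        [0.1, 1.0, 0.2, 0.8],
        [0.9, 0.2, 1.0, 0.1],
        [0.2, 0.8, 0.1, 1.0],
    ]
    labels, weights = weighted_medoid_clustering(sim, k=2)
    print(labels)
    print(weights)
```

With a single object holding weight 1 in each cluster, the assignment step reduces to ordinary k-medoids assignment; spreading the weight over several members is what lets multiple objects describe a cluster together.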
