Journal: Knowledge-Based Systems

A quality driven Hierarchical Data Divisive Soft Clustering for information retrieval


Abstract

In this paper an adaptive hierarchical fuzzy clustering algorithm is presented, named Hierarchical Data Divisive Soft Clustering (H2D-SC). The main novelty of the proposed algorithm is that it is a quality driven algorithm, since it dynamically evaluates a multi-dimensional quality measure of the clusters to drive the generation of the soft hierarchy. Specifically, it generates a hierarchy in which each node is split into a variable number of sub-nodes, determined by an innovative quality assessment of soft clusters, based on the evaluation of multiple dimensions such as the cluster's cohesion, its cardinality, its mass, and its fuzziness, as well as the partition's entropy. Clusters at the same hierarchical level share a minimum quality value: clusters in the lower levels of the hierarchy have a higher quality; this way more specific clusters (lower level clusters) have a higher quality than more general clusters (upper level clusters). Further, since the algorithm generates a soft partition, a document can belong to several sub-clusters with distinct membership degrees. The proposed algorithm is divisive, and it is based on a combination of a modified bisecting K-Means algorithm with a flat soft clustering algorithm used to partition each node. The paper describes the algorithm and its evaluation on two standard collections.
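The abstract gives no pseudocode, so the following is only a minimal sketch of the general idea of quality-gated divisive soft clustering, not the H2D-SC algorithm itself. It makes several assumptions that are not in the source: standard fuzzy c-means stands in for the flat soft clustering step, every split uses a fixed k = 2 rather than a variable number of sub-nodes, and normalised partition entropy alone stands in for the multi-dimensional quality measure (cohesion, cardinality, mass and fuzziness are omitted). The names `fuzzy_cmeans`, `partition_entropy` and `split`, and all threshold values, are hypothetical.

```python
import numpy as np


def fuzzy_cmeans(X, k, m=2.0, iters=50, seed=0):
    """Standard fuzzy c-means (assumed stand-in for the flat soft clusterer).

    Returns (centers, U), where U[i, j] is the membership of point i in
    cluster j and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], k))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Pairwise point-to-center distances; epsilon avoids division by zero.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U


def partition_entropy(U):
    """Partition entropy normalised by log(k): 0 is crisp, 1 is maximally fuzzy."""
    k = U.shape[1]
    H = -np.mean(np.sum(U * np.log(np.clip(U, 1e-12, 1.0)), axis=1))
    return H / np.log(k)


def split(X, idx, depth, max_depth=2, min_size=4, max_entropy=0.9, thresh=0.3):
    """Recursively split a node while the soft partition looks good enough.

    Hypothetical quality gate: a split is kept only if its normalised
    partition entropy is at most max_entropy. A point joins every child in
    which its membership reaches thresh, so children may overlap.
    """
    node = {"members": idx, "children": []}
    if depth >= max_depth or len(idx) < min_size:
        return node
    _, U = fuzzy_cmeans(X[idx], k=2)
    if partition_entropy(U) > max_entropy:
        return node  # partition too fuzzy; keep this node as a leaf
    for j in range(U.shape[1]):
        child_idx = idx[U[:, j] >= thresh]
        if 0 < len(child_idx) < len(idx):
            node["children"].append(
                split(X, child_idx, depth + 1, max_depth, min_size,
                      max_entropy, thresh)
            )
    return node
```

Calling `split(X, np.arange(len(X)), depth=0)` on a document-vector matrix `X` returns a nested dictionary of overlapping clusters. Because membership is thresholded rather than assigned by argmax, a document with comparable membership in two sub-clusters appears in both, which reflects the soft-partition property the abstract describes.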
