A Novel Fuzzy Kernel C-Means Algorithm for Document Clustering

Abstract

The Fuzzy Kernel C-Means (FKCM) algorithm can significantly improve accuracy over classical Fuzzy C-Means algorithms on data that are not linearly separable, are high-dimensional, or form overlapping clusters in the input space. Despite these advantages, several limitations hinder its real-world application: convergence to local optima, sensitivity to outliers, the need to fix the number of clusters c in advance, and slow convergence. To overcome these disadvantages, semi-supervised learning and a validity index are employed. Semi-supervised learning uses a small amount of labeled data to guide the clustering of a large amount of unlabeled data, which helps FKCM avoid the drawbacks listed above. The number of clusters greatly affects clustering performance, and the optimal number cannot be assumed in advance, especially for large text corpora. A validity function makes it possible to determine a suitable number of clusters during the clustering process. A sparse data format together with a scatter-and-gather strategy saves considerable storage space and computation time. Experimental results on the Reuters-21578 benchmark dataset demonstrate that the proposed algorithm is more flexible and accurate than the state-of-the-art FKCM.
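
For orientation, the generic kernel fuzzy c-means iteration that the abstract builds on can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' semi-supervised algorithm: it uses a Gaussian kernel, a fuzzifier m = 2, cluster prototypes kept implicitly in feature space, and a fixed number of clusters c. The function names `gaussian_kernel` and `kernel_fcm` are hypothetical, and the validity index, labeled-data guidance, and sparse storage mentioned in the abstract are not included.

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix for the rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def kernel_fcm(K, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Generic kernel fuzzy c-means on a precomputed kernel matrix K (n x n).

    Returns the membership matrix U of shape (c, n); each column sums to 1.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)          # random fuzzy partition
    for _ in range(n_iter):
        W = U ** m
        W /= W.sum(axis=1, keepdims=True)      # per-cluster weights, rows sum to 1
        # Squared feature-space distance from point j to implicit prototype i:
        # K_jj - 2 * sum_k W_ik K_kj + sum_{k,l} W_ik W_il K_kl
        self_term = np.einsum("ik,kl,il->i", W, K, W)
        d2 = np.diag(K)[None, :] - 2.0 * W @ K + self_term[:, None]
        d2 = np.maximum(d2, 1e-12)             # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)   # standard FCM membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U

# Usage on two synthetic, well-separated groups of points (illustrative data only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
U = kernel_fcm(gaussian_kernel(X, sigma=1.0), c=2)
labels = U.argmax(axis=0)                      # hard labels from fuzzy memberships
```

Working from the kernel matrix alone keeps the sketch independent of the feature map. For document clustering the input would typically be term vectors rather than 2-D points, and c would have to be chosen by some external criterion such as the validity function the abstract describes.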

Bibliographic details

Source: Information Retrieval Technology, 2008, pp. 418-423 (6 pages)
Authors: Yingshun Yin; Xiaobin Zhang; Baojun Miao; Lili Gao
Original format: PDF
Chinese Library Classification: computer equipment security
Keywords: text clustering; semi-supervised learning; fuzzy kernel c-means; kernel validity index

Similar literature

1. An Unsupervised Kernel Based Fuzzy C-Means Clustering Algorithm with Kernel Normalisation [J]. Shang-Ming Zhou, John Q. Gan. International Journal of Computational Intelligence and Applications. 2004, No. 4
2. A Kernel Fuzzy c-Means Clustering-Based Fuzzy Support Vector Machine Algorithm for Classification Problems With Outliers or Noises [J]. Yang X., Zhang G., Lu J., Ma J. IEEE Transactions on Fuzzy Systems. 2011, No. 1
3. Taylor kernel fuzzy C-means clustering algorithm for trust and energy-aware cluster head selection in wireless sensor networks [J]. Augustine Susan, Ananth J. P. Wireless Networks. 2020, No. 7
4. A Novel Fuzzy Kernel C-Means Algorithm for Document Clustering [C]. Yingshun Yin, Xiaobin Zhang, Baojun Miao. Asia Information Retrieval Symposium. 2008
5. A Genetic Algorithm that Exchanges Neighboring Centers for Fuzzy c-Means Clustering [D]. Chahine, Firas Safwan. 2012
6. Differential privacy fuzzy C-means clustering algorithm based on gaussian kernel function [O]. Yaling Zhang, Jin Han. 2021
7. Designing RBFNs Structure Using Similarity-Based and Kernel-Based Fuzzy C-Means Clustering Algorithms [O]. Ireneusz Czarnowski, Joanna Jedrzejowicz, Piotr Jedrzejowicz. 2021
8. Fuzzy Robust Statistics for Application to the Fuzzy c-Means Clustering Algorithm [R]. Kersten, P. R. 1993
