Journal: International Journal of Intelligent Systems

Algorithms of Nonlinear Document Clustering Based on Fuzzy Multiset Model

Abstract

A fuzzy multiset is applicable as a model of information retrieval because its mathematical structure expresses both the number of occurrences of an element and its degree of attribution (membership) simultaneously. Fuzzy multisets are therefore also a suitable model for document clustering. This paper develops clustering algorithms for documents based on a fuzzy multiset model. The cosine correlation, a standard proximity measure, is generalized to the multiset model, and two nonlinear techniques are applied to existing clustering methods: one introduces a variable that controls cluster volume sizes, and the other is the kernel trick used in support vector machines. Clustering by competitive learning is also studied. When the kernel trick is used, the classification configuration of the data in the high-dimensional feature space is visualized with self-organizing maps. Two numerical examples, one using artificial data and the other using real document data, are presented, and the effects of the proposed methods are discussed.
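
To make the idea of a cosine correlation on fuzzy multisets and a kernel-induced distance more concrete, the following Python sketch may help. It is an illustration under stated assumptions, not the definitions used in the paper: a document is assumed to be a dictionary mapping each term to its list of membership grades, the inner product is assumed to pair the grades of each term after sorting them in decreasing order, and the Gaussian kernel with parameter `lam` is an assumed choice. All function names are hypothetical.

```python
import math

def normalize(fm):
    """Sort each term's positive membership grades in decreasing order."""
    return {t: sorted((g for g in grades if g > 0), reverse=True)
            for t, grades in fm.items()}

def inner(a, b):
    """Assumed inner product: pair the sorted grades of shared terms.
    zip() stops at the shorter list, which equals padding with zeros."""
    return sum(x * y
               for term in a.keys() & b.keys()
               for x, y in zip(a[term], b[term]))

def cosine(a, b):
    """Cosine correlation between two fuzzy multisets (0.0 if one is empty)."""
    a, b = normalize(a), normalize(b)
    denom = math.sqrt(inner(a, a) * inner(b, b))
    return inner(a, b) / denom if denom else 0.0

def gaussian_kernel(a, b, lam=1.0):
    """Gaussian kernel on the squared distance induced by inner()."""
    a, b = normalize(a), normalize(b)
    sq = inner(a, a) + inner(b, b) - 2.0 * inner(a, b)
    return math.exp(-lam * max(sq, 0.0))

def kernel_distance_sq(a, b, lam=1.0):
    """Squared feature-space distance via the kernel trick:
    K(a,a) + K(b,b) - 2K(a,b), which is 2 - 2K(a,b) for a Gaussian kernel."""
    return 2.0 - 2.0 * gaussian_kernel(a, b, lam)

# Hypothetical documents: "fuzzy" occurs twice in doc1 with grades 0.9 and 0.4.
doc1 = {"fuzzy": [0.9, 0.4], "cluster": [0.7]}
doc2 = {"fuzzy": [0.4, 0.9]}
doc3 = {"kernel": [1.0]}
similarity = cosine(doc1, doc2)
```

The kernel distance is computed without ever constructing the high-dimensional feature vectors, which is the point of the kernel trick; a clustering method would use `kernel_distance_sq` in place of the ordinary squared distance.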
