Algorithms of nonlinear document clustering based on fuzzy multiset model

Abstract

Fuzzy multisets are applicable as a model of information retrieval because they have a mathematical structure that expresses the number and the degree of attribution of an element simultaneously. Fuzzy multisets can therefore also serve as a suitable model for document clustering. This paper aims to develop clustering algorithms based on a fuzzy multiset model for document clustering. The standard proximity measure, the cosine correlation, is generalized to the multiset model, and two nonlinear clustering techniques are applied to existing clustering methods. One introduces a variable for controlling cluster volume sizes; the other is the kernel trick used in support vector machines. Clustering by competitive learning is also studied. When the kernel trick is used, the classification configuration of the data in the high-dimensional feature space is visualized by self-organizing maps. Two numerical examples, using artificial data and real document data, are shown, and the effects of the proposed methods are discussed.
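The abstract does not state the generalized cosine formula, so the following Python sketch is only an illustration under stated assumptions, not the authors' definition. It assumes a document is a fuzzy multiset in which each term may occur several times, each occurrence carrying a grade in [0, 1] (the abstract's "degree of attribution"). It further assumes that the grades of a term are sorted in decreasing order and that the inner product pairs the sorted grades position by position. The names `to_fuzzy_multiset`, `inner`, and `cosine` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def to_fuzzy_multiset(pairs):
    """Group (term, grade) pairs into {term: grades sorted in decreasing order}."""
    grouped = defaultdict(list)
    for term, grade in pairs:
        if not 0.0 <= grade <= 1.0:
            raise ValueError(f"grade {grade!r} for {term!r} is outside [0, 1]")
        if grade > 0.0:  # a zero grade means the occurrence is absent
            grouped[term].append(grade)
    return {term: sorted(grades, reverse=True) for term, grades in grouped.items()}


def inner(a, b):
    """Assumed inner product: sum over terms of products of position-matched grades.

    zip() stops at the shorter sequence, which is equivalent to padding the
    shorter grade sequence with zeros.
    """
    total = 0.0
    for term, grades_a in a.items():
        grades_b = b.get(term, [])
        total += sum(x * y for x, y in zip(grades_a, grades_b))
    return total


def cosine(a, b):
    """Cosine correlation under the assumed inner product (0.0 if either is empty)."""
    denom = sqrt(inner(a, a) * inner(b, b))
    return inner(a, b) / denom if denom > 0.0 else 0.0


# Hypothetical documents: "fuzzy" occurs twice in doc1 with different grades.
doc1 = to_fuzzy_multiset([("fuzzy", 0.9), ("fuzzy", 0.4), ("cluster", 0.7)])
doc2 = to_fuzzy_multiset([("fuzzy", 0.8), ("kernel", 0.6)])
similarity = cosine(doc1, doc2)
```

When every term occurs at most once, each grade sequence has length one and this measure reduces to the ordinary cosine between fuzzy-set membership vectors.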
