A Genetic Niching Algorithm with Self-Adaptating Operator Rates for Document Clustering

Abstract

We propose a genetic algorithm for document clustering in which an evolutionary multimodal optimization algorithm evolves candidate cluster representatives to search for dense regions in the sparse, high-dimensional vector space of text documents. Evolution affects not only the document cluster representatives but also the genetic operator rates, which evolve simultaneously with them. The evolving population consists of candidate document cluster representatives encoded as variable-length sparse index vectors and sparse index/frequency vectors. Specialized sparse genetic operators are defined for this representation, and they achieve different degrees of exploitation and exploration in the search for optimal document cluster prototypes. The operator most specialized for document clustering is Sparse Top-K-Addition, which can be seen as an incentive towards more aggressive exploitation of the local context in a small subset of documents, whereas the simple Sparse Real Addition operator works in a more exploratory manner. As shown in our experiments on two well-known document data sets, taking into account associated terms within a local context adds the benefit of an explicit latent semantic consideration in the search for optimal term lists to describe the cluster prototypes.
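
The idea of operator rates that evolve together with the solutions can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the update rule, so the reward/penalty scheme, the function names (`normalize`, `pick_operator`, `adapt_rates`), and the operator labels used as dictionary keys are all assumptions made for illustration. Each individual is assumed to carry its own rate per operator; an operator is drawn in proportion to its rate, and its rate is raised if the offspring improved on the parent and lowered otherwise.

```python
import random


def normalize(rates):
    """Rescale operator rates so that they sum to 1."""
    total = sum(rates.values())
    return {name: rate / total for name, rate in rates.items()}


def pick_operator(rates, rng):
    """Roulette-wheel choice of an operator name, proportional to its rate."""
    threshold = rng.random()
    cumulative = 0.0
    for name, rate in rates.items():
        cumulative += rate
        if threshold < cumulative:
            return name
    return name  # guard against floating-point round-off


def adapt_rates(rates, used, improved, rng):
    """Assumed update rule: reward the operator that was used if the
    offspring improved on its parent, otherwise penalize it."""
    delta = rng.random()  # hypothetical random learning step in [0, 1)
    factor = 1.0 + delta if improved else 1.0 - delta
    updated = dict(rates)
    updated[used] = rates[used] * factor
    return normalize(updated)


if __name__ == "__main__":
    rng = random.Random(0)
    # Operator labels borrowed from the abstract; the operators themselves
    # are not implemented here.
    rates = normalize({"sparse_top_k_addition": 1.0, "sparse_real_addition": 1.0})
    chosen = pick_operator(rates, rng)
    rates = adapt_rates(rates, chosen, improved=True, rng=rng)
    print(chosen, rates)
```

Under this assumed rule, an operator that keeps producing better offspring for a given individual gradually receives a larger share of the selection probability, while the normalization step keeps the rates a valid probability distribution.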

Bibliographic record

Source: 2012 Eighth Latin American Web Congress, 2012, pp. 79-86 (8 pages)
Conference location: Cartagena de Indias (CO)
Authors: Leon Elizabeth; Gomez Jonatan; Nasraoui Olfa
Original format: PDF
Language: English
Chinese Library Classification: Computer networks

