A Genetic Niching Algorithm with Self-Adaptating Operator Rates for Document Clustering

Abstract

We propose a genetic algorithm for document clustering in which an evolutionary multimodal optimization algorithm evolves candidate cluster representatives to search for dense regions in the sparse, high-dimensional vector space of text documents. Evolution affects not only the document cluster representatives but also the genetic operator rates, which are evolved simultaneously with them. The evolving population consists of candidate document cluster representatives encoded as variable-length sparse index vectors and sparse index/frequency vectors. Specialized sparse genetic operators are defined for this representation. These operators achieve different degrees of exploitation and exploration in the search for optimal document cluster prototypes. The operator most specialized for the document clustering problem is the Sparse Top-K-Addition operator, which can be seen as an incentive toward more aggressive exploitation of the local context in a small subset of documents, whereas the simple Sparse Real Addition operator works in a more exploratory manner. As shown in our experiments on two well-known document data sets, taking into account associated terms within a local context adds the benefit of an explicit latent semantic consideration in the search for optimal term lists to describe the cluster prototypes.

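The abstract names the ingredients (sparse index/frequency prototypes, an exploratory Sparse Real Addition operator, an exploitative Sparse Top-K-Addition operator, and operator rates that evolve with each individual) but does not define them. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. Every function body, the reward/penalty rate update, the `density` fitness, and all parameter names (`k`, `n_local`, `delta`) are assumptions made for illustration, and niching is omitted entirely.

```python
import math
import random
from typing import Dict, List, Tuple

SparseVec = Dict[int, float]  # term index -> weight (e.g. term frequency)


def cosine(a: SparseVec, b: SparseVec) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(i, 0.0) for i, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def density(proto: SparseVec, docs: List[SparseVec]) -> float:
    """Stand-in fitness (assumed): mean cosine similarity to all documents."""
    return sum(cosine(proto, d) for d in docs) / len(docs)


def sparse_real_addition(proto: SparseVec, vocab_size: int,
                         rng: random.Random) -> SparseVec:
    """Exploratory operator (assumed form): bump one random term's weight."""
    child = dict(proto)
    idx = rng.randrange(vocab_size)
    child[idx] = child.get(idx, 0.0) + rng.random()
    return child


def sparse_top_k_addition(proto: SparseVec, docs: List[SparseVec],
                          k: int = 2, n_local: int = 2) -> SparseVec:
    """Exploitative operator (assumed form): add the k heaviest terms of the
    local context, i.e. the n_local documents most similar to the prototype."""
    local = sorted(docs, key=lambda d: cosine(proto, d), reverse=True)[:n_local]
    totals: SparseVec = {}
    for doc in local:
        for i, w in doc.items():
            totals[i] = totals.get(i, 0.0) + w
    top = sorted(totals, key=lambda i: totals[i], reverse=True)[:k]
    child = dict(proto)
    for i in top:
        child[i] = child.get(i, 0.0) + totals[i] / len(local)
    return child


def adapt_rates(rates: List[float], used: int, improved: bool,
                delta: float = 0.1) -> List[float]:
    """Reward or penalize the operator that was used, then renormalize."""
    new = list(rates)
    new[used] *= (1.0 + delta) if improved else (1.0 - delta)
    total = sum(new)
    return [r / total for r in new]


def evolve_step(proto: SparseVec, rates: List[float], docs: List[SparseVec],
                vocab_size: int,
                rng: random.Random) -> Tuple[SparseVec, List[float]]:
    """One step for one individual: pick an operator by its current rate,
    keep the child only if it is fitter, and update the individual's rates."""
    used = rng.choices(range(len(rates)), weights=rates)[0]
    if used == 0:
        child = sparse_real_addition(proto, vocab_size, rng)
    else:
        child = sparse_top_k_addition(proto, docs)
    improved = density(child, docs) > density(proto, docs)
    return (child if improved else proto), adapt_rates(rates, used, improved)


# Toy usage: three tiny "documents" over an 8-term vocabulary.
docs = [{0: 2.0, 1: 1.0}, {0: 1.0, 1: 2.0}, {5: 3.0, 6: 1.0}]
rng = random.Random(0)
proto, rates = {0: 1.0}, [0.5, 0.5]
for _ in range(20):
    proto, rates = evolve_step(proto, rates, docs, vocab_size=8, rng=rng)
```

Storing prototypes as index-to-weight dictionaries keeps them variable-length, so operators only touch the few terms that are present. Carrying `rates` alongside each prototype is what lets the operator mix change per individual over time.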

Bibliographic details

Source: Latin American Web Conference | 2012 | 8 pages
Authors: Leon Elizabeth; Gomez Jonatan; Nasraoui Olfa
Original format: PDF
Chinese Library Classification: TP393-53

Similar literature

1. Fast global optimization of SixHy clusters: new mutation operators in the cluster genetic algorithm [J]. Ge YB, Head JD. Chemical Physics Letters. 2004, Issue 1a3
2. A robust dynamic niching genetic algorithm with niche migration for automatic clustering problem [J]. Chang DX, Zhang XD, Zheng CW. Pattern Recognition: The Journal of the Pattern Recognition Society. 2010, Issue 4
3. A Novel Document Clustering Algorithm Using Squared Distance Optimization Through Genetic Algorithms [J]. Harish Verma, Eatesh Kandpal, Bipul Pandey. International Journal on Computer Science and Engineering. 2010, Issue 5
4. A Genetic Niching Algorithm with Self-Adaptating Operator Rates for Document Clustering [C]. Leon Elizabeth, Gomez Jonatan, Nasraoui Olfa. 2012 Eighth Latin American Web Congress. 2012
5. Clustering-Based Genetic Algorithm Initialization and Crossover Operators for Market-Driven Design [D]. Yan, Yuchen. 2020
6. A Modified Genetic Algorithm with Local Search Strategies and Multi-Crossover Operator for Job Shop Scheduling Problem [O]. Monique Simplicio Viana, Orides Morandin Junior, Rodrigo Colnago Contreras. 2020
7. Comparison of Document Clustering algorithm using Genetic Algorithms by Individual Structures [O]. Lim-Cheon Choi, Wei Song, Soon-Cheol Park. 2011
