Swarm Intelligence Algorithms in Text Document Clustering with Various Benchmarks

Abstract

Text document clustering is the unsupervised classification of textual documents into clusters based on content similarity. It can be applied to tasks such as search optimization and extracting hidden information from data generated by IoT sensors. Swarm intelligence (SI) algorithms use stochastic and heuristic principles in which simple, unintelligent individuals follow a few simple rules to accomplish very complex tasks. By mapping the features of a problem to the parameters of an SI algorithm, SI algorithms can reach solutions in a flexible, robust, decentralized, and self-organized manner. Compared with traditional clustering algorithms, these solving mechanisms make swarm algorithms well suited to complex document clustering problems. However, each SI algorithm performs differently depending on its own strengths and weaknesses. In this paper, to find the best-performing SI algorithm for text document clustering, we performed a comparative study of the particle swarm optimization (PSO), bat, grey wolf optimization (GWO), and K-means algorithms using six data sets of various sizes, created from BBC Sport news and 20 newsgroups. Based on our experimental results, we discuss the features of the document clustering problem in relation to the nature of SI algorithms and conclude that the PSO and GWO SI algorithms are better than K-means, and that among these algorithms PSO performs best in terms of finding the optimal solution.
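
To make the idea of mapping a clustering problem onto an SI algorithm concrete, the following is a minimal sketch of PSO-based clustering. It is not the authors' implementation. It assumes each particle encodes k candidate centroids, and that fitness is the mean cosine distance from each document vector to its nearest centroid. The function names and the values of `w`, `c1`, `c2`, `n_particles`, and `n_iter` are illustrative assumptions, and the toy vectors stand in for real term-weight vectors (for example TF-IDF).

```python
import math
import random


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def fitness(docs, centroids):
    """Assumed objective: mean distance from each document to its nearest centroid."""
    return sum(min(cosine_distance(d, c) for c in centroids) for d in docs) / len(docs)


def assign(docs, centroids):
    """Label each document with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: cosine_distance(d, centroids[j]))
        for d in docs
    ]


def pso_cluster(docs, k, n_particles=10, n_iter=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Hypothetical helper: search for k centroids with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(docs[0])
    # Each particle is a list of k centroids, initialised from k distinct documents.
    positions = [[list(d) for d in rng.sample(docs, k)] for _ in range(n_particles)]
    velocities = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in p] for p in positions]
    pbest_fit = [fitness(docs, p) for p in positions]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Standard velocity update: inertia + cognitive + social terms.
                    velocities[i][j][d] = (
                        w * velocities[i][j][d]
                        + c1 * r1 * (pbest[i][j][d] - positions[i][j][d])
                        + c2 * r2 * (gbest[j][d] - positions[i][j][d])
                    )
                    positions[i][j][d] += velocities[i][j][d]
            f = fitness(docs, positions[i])
            if f < pbest_fit[i]:
                pbest[i] = [c[:] for c in positions[i]]
                pbest_fit[i] = f
                if f < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in positions[i]], f
    return gbest, assign(docs, gbest), gbest_fit


if __name__ == "__main__":
    # Toy 3-term document vectors forming two obvious groups (illustrative data only).
    toy_docs = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
    centroids, labels, best = pso_cluster(toy_docs, k=2)
    print(labels, round(best, 4))
```

K-means would instead refine a single set of centroids by alternating assignment and mean updates. The sketch keeps a population of candidate centroid sets and moves each one toward its own best and the swarm's best, which is the decentralized search behaviour the abstract refers to.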

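Bibliographic record

  • Journal: Sensors (Basel Switzerland)
  • Authors

    Suganya Selvaraj; Eunmi Choi;

  • Author affiliation
  • Year (volume), issue: 2021(21),9
  • Year: 2021
  • Page: 3196
  • Total pages: 18
  • Original format: PDF
  • Text language
  • CLC classification
  • Keywords

    swarm intelligence algorithms; text document clustering; artificial intelligence; data mining;
  • Date added: 2022-08-21 12:28:29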
