Journal: The Computer Journal

Optimal Text Document Clustering Enabled by Weighed Similarity Oriented Jaya With Grey Wolf Optimization Algorithm



Abstract

Scientific progress has brought a range of challenges to the field of information retrieval, driven largely by the growing use of large volumes of data. These data originate in large-scale distributed networks, and centralizing them for analysis is difficult. New text document clustering algorithms are therefore needed to overcome the challenges of clustering, the two most important of which are clustering accuracy and clustering quality. This paper presents a clustering model for text documents that uses term frequency-inverse document frequency (TF-IDF) values as the feature set. The model concentrates on initial centroid selection, so that the text can be clustered automatically using a weighted similarity measure in the proposed clustering process. The weighted similarity function involves the inter-cluster and intra-cluster similarity of both ordered and unordered documents, and it is used to minimize the weighted similarity among the documents. The clustering model is driven by a hybrid optimization algorithm that combines the Jaya Algorithm (JA) and the Grey Wolf Optimization algorithm (GWO); the proposed algorithm is termed JA-based GWO. Finally, the performance of the proposed model is verified through a comparative analysis against state-of-the-art models. For the similarity index, the analysis reports that the proposed model is 96.56% better than the genetic algorithm, 99.46% better than particle swarm optimization, 97.09% better than the Dragonfly algorithm, and 96.21% better than JA. The authors conclude that this analysis confirms the efficiency of the proposed model.
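The abstract names its building blocks (TF-IDF features, a similarity measure, and the Jaya and Grey Wolf update rules) without giving formulas. The Python sketch below illustrates the textbook form of each piece and is not the authors' implementation. The smoothed IDF variant, the use of cosine similarity, and every function name (`tfidf`, `cosine`, `jaya_step`, `gwo_step`) are assumptions made for illustration. The abstract does not say how JA and GWO are combined, nor how the weighted similarity is defined, so neither is reproduced here.

```python
import math
import random
from collections import Counter


def tfidf(docs):
    """Return one sparse TF-IDF dict per tokenized document.

    Assumption: idf = log(N / df) + 1, a common smoothed variant;
    the abstract does not state which weighting the paper uses.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaya_step(x, best, worst, rng):
    """Standard Jaya update: move toward the best and away from the worst."""
    return [
        xi + rng.random() * (b - abs(xi)) - rng.random() * (w - abs(xi))
        for xi, b, w in zip(x, best, worst)
    ]


def gwo_step(x, alpha, beta, delta, a, rng):
    """Standard Grey Wolf update: average the pulls of the three leaders.

    `a` is the control parameter, usually decreased from 2 to 0 over the run.
    """
    new_position = []
    for j, xj in enumerate(x):
        total = 0.0
        for leader in (alpha, beta, delta):
            coef_a = 2.0 * a * rng.random() - a
            coef_c = 2.0 * rng.random()
            total += leader[j] - coef_a * abs(coef_c * leader[j] - xj)
        new_position.append(total / 3.0)
    return new_position


if __name__ == "__main__":
    rng = random.Random(0)
    docs = [["grey", "wolf"], ["grey", "jaya"], ["text", "cluster"]]
    vecs = tfidf(docs)
    print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
    print(jaya_step([0.5, 0.2], [0.9, 0.1], [0.1, 0.8], rng))
```

In a clustering setting, each candidate solution `x` would typically encode a set of centroids, and the fitness would be built from similarities such as `cosine`; how the paper weights inter-cluster against intra-cluster similarity cannot be determined from the abstract.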
