Journal: American Journal of Information Systems

Adapt Clustering Methods for Arabic Documents

Abstract

This paper develops a new clustering method (FWC) and proposes a new approach to filtering data collected from internet resources. Clustering groups data instances into subsets such that similar instances are grouped together, while dissimilar instances belong to different groups. The instances are thereby organized into an efficient representation that characterizes the population being sampled, reducing the very large volume of retrieved data. This is done by removing dissimilar text files and grouping similar documents into homogeneous clusters. Arabic text files totaling 974 MB were collected, processed, analyzed, and filtered using common clustering methods. These clustering methods are presented, divided into hierarchical, partitioning, density-based, model-based, and soft-computing methods. Following that, the challenges of clustering large data sets are discussed and tested with the proposed method. Two experiments were conducted to establish the effectiveness of FWC, and the results show that the FWC method proposed in this paper produced better results and outperformed existing clustering methods.