Distributed Document Clustering Analysis Based on a Hybrid Method

     

Abstract

Clustering has become a challenging task because the volume of data in scientific research and commercial applications keeps growing. High-quality, fast document clustering algorithms are in great demand to deal with large volumes of data. Bringing such growing amounts of data to a central site for clustering imposes complex computational requirements. The proposed algorithm uses Particle Swarm Optimization (PSO) to find optimal centroids for K-Means clustering. PSO's global search ability provides optimal centroids, which helps generate more compact clusters with improved accuracy. The proposed methodology uses the Hadoop and MapReduce framework, which provides distributed storage and analysis to support data-intensive distributed applications. Experiments performed on the Reuters and RCV1 document datasets show an improvement in accuracy with reduced execution time.
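
The hybrid idea in the abstract (a PSO global search that hands its best centroids to K-Means for refinement) can be illustrated with a minimal single-machine sketch. This is not the authors' implementation: the function names, the PSO coefficients (`w`, `c1`, `c2`), the swarm size, and the use of the sum of squared errors as the fitness function are all assumptions made for illustration, and the Hadoop/MapReduce distribution described in the abstract is not reproduced here. Documents are assumed to be already converted to numeric feature vectors (rows of `X`).

```python
import numpy as np

def squared_distances(X, centroids):
    """Squared Euclidean distance from every point to every centroid."""
    return ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)

def sse(X, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return squared_distances(X, centroids).min(axis=1).sum()

def pso_centroids(X, k, n_particles=10, n_iter=30, w=0.72, c1=1.49, c2=1.49, seed=0):
    """Global-best PSO over candidate centroid sets (parameter values are assumptions)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Each particle encodes one candidate set of k centroids, drawn from the data.
    pos = np.stack([X[rng.choice(n, size=k, replace=False)] for _ in range(n_particles)])
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([sse(X, p) for p in pos])
    g = pbest_fit.argmin()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Standard velocity update: inertia + pull to personal best + pull to global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        fit = np.array([sse(X, p) for p in pos])
        improved = fit < pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmin()
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest

def kmeans(X, centroids, n_iter=20):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = centroids.copy()
    for _ in range(n_iter):
        labels = squared_distances(X, centroids).argmin(axis=1)
        for j in range(len(centroids)):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    # Recompute labels so they match the final centroids.
    labels = squared_distances(X, centroids).argmin(axis=1)
    return centroids, labels

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Three synthetic, well-separated groups standing in for document vectors.
    X = np.vstack([rng.normal(loc=c, scale=0.1, size=(20, 2))
                   for c in ([0, 0], [5, 5], [0, 5])])
    seeds = pso_centroids(X, k=3)
    centroids, labels = kmeans(X, seeds)
    print("SSE after PSO seeding:", sse(X, seeds))
    print("SSE after K-Means refinement:", sse(X, centroids))
```

In a MapReduce setting, the nearest-centroid assignment would typically run in the map phase and the per-cluster averaging in the reduce phase; whether the paper partitions the work exactly this way is not stated in the abstract.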

Bibliographic record

  • Source
    China Communications (《中国通信》) | 2017, Issue 2 | pp. 131-142 | 12 pages
  • Authors

    J.E.Judith; J.Jayakumari;

  • Affiliations

    Noorul Islam Centre for Higher Education, Kumaracoil, India.;

    Noorul Islam Centre for Higher Education, Kumaracoil, India.;

  • Original format: PDF
  • Language: English