Source: International Conference on Cloud Computing and Big Data

Improving Government-Data Learning via Distributed Clustering Analysis

Abstract

Clustering analysis is a valuable area of study, and the volume of large-scale government data that must be handled by cluster analysis keeps growing. Handling data at this scale requires efficient analysis techniques. The traditional serial programming model scales poorly and cannot meet the computing and storage demands of large-scale government-data processing. Distributed computing technology, represented by MapReduce, scales well, can greatly improve the execution efficiency of data-intensive algorithms, and makes use of the computing power of clusters built from general-purpose hardware. Against the background of a "data platform for public petition", this work studies how to combine cluster analysis with today's massive government data, extracting useful information from the characteristics hidden in the data so as to provide comprehensive analysis for system managers and decision makers. The paper focuses on combining a basic distributed clustering algorithm with the TF-IDF algorithm and on developing a case feature analysis module based on distributed clustering. Using the information in each case, the module clusters cases by their characteristics and then uncovers several pieces of hidden information from the resulting decision outcomes.
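The abstract does not say which clustering algorithm the authors distribute, so the following is only a minimal sketch under stated assumptions: it assumes k-means as the "basic" clustering algorithm, a common TF-IDF weighting (term frequency times log of inverse document frequency), and plain Python functions standing in for MapReduce map and reduce steps. All function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def tf_idf(docs):
    """Turn tokenized documents (lists of terms) into sparse TF-IDF dicts."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    keys = set(a) | set(b)
    return sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys)


def map_assign(vector, centroids):
    """Map step (assumed k-means): emit (nearest centroid index, vector)."""
    nearest = min(
        range(len(centroids)), key=lambda i: sq_dist(vector, centroids[i])
    )
    return nearest, vector


def reduce_mean(vectors):
    """Reduce step: average all vectors that share one centroid index."""
    total = defaultdict(float)
    for v in vectors:
        for term, weight in v.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def kmeans_iteration(vectors, centroids):
    """One map/shuffle/reduce round; an empty cluster keeps its old centroid."""
    groups = defaultdict(list)
    for v in vectors:
        key, value = map_assign(v, centroids)
        groups[key].append(value)  # stands in for the shuffle phase
    return [
        reduce_mean(groups[i]) if groups[i] else centroids[i]
        for i in range(len(centroids))
    ]


# Hypothetical petition-like cases, already tokenized.
cases = [
    ["road", "noise"],
    ["road", "repair"],
    ["tax", "refund"],
    ["tax", "rebate"],
]
case_vectors = tf_idf(cases)
initial_centroids = [case_vectors[0], case_vectors[2]]
new_centroids = kmeans_iteration(case_vectors, initial_centroids)
assignments = [map_assign(v, new_centroids)[0] for v in case_vectors]
```

In a real MapReduce deployment, `map_assign` would run in parallel over partitions of the case vectors and `reduce_mean` would run once per centroid key; the loop in `kmeans_iteration` only imitates that flow on a single machine.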
