New Algorithm for Clustering Distributed Data Using k-Means

Khedr Ahmed M.; Bhatnagar Raj K.

Abstract

The internet era and high-speed networks have made large amounts of geographically distributed data readily accessible. Individuals, businesses, and governments recognize the value of this resource to those who can transform the data into information. These databases, though valuable as individual entities, become significantly more valuable when they function as parts of a federated database and their data can be aggregated for collective mining or computation. This requires new algorithms to shift their focus from working with single databases to working efficiently with federated databases. In this paper, we propose a new decomposable version of the popular k-means clustering algorithm that works in this manner with a set of networked databases. We show that it is possible to perform the global computation in a reasonably secure manner for either horizontally or vertically distributed databases. The computation is completed by exchanging only a few local summaries among the databases. An empirical and analytical validation of our results is also presented.
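The abstract does not spell out the decomposition, so the following is only a minimal sketch of the general idea for the horizontally distributed case, not the authors' algorithm. It assumes each site holds complete records, that the summary exchanged per iteration is a per-cluster coordinate sum and record count, and that a coordinator merges those summaries. All function names (`local_summary`, `merge_summaries`, `distributed_kmeans`) are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def local_summary(points, centroids):
    """Runs at one site: returns per-cluster coordinate sums and counts.

    Only these aggregates leave the site; the raw records do not.
    """
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        # Assign the record to its nearest current centroid.
        j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts

def merge_summaries(summaries, centroids):
    """Runs at the coordinator: combines site summaries into new centroids."""
    k, dim = len(centroids), len(centroids[0])
    updated = []
    for j in range(k):
        total = sum(counts[j] for _, counts in summaries)
        if total == 0:
            # Empty cluster: keep the previous centroid (an assumed policy).
            updated.append(tuple(centroids[j]))
            continue
        updated.append(tuple(
            sum(sums[j][d] for sums, _ in summaries) / total
            for d in range(dim)
        ))
    return updated

def distributed_kmeans(sites, centroids, max_iter=100, tol=1e-9):
    """Iterates until the centroids stop moving or max_iter is reached."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        summaries = [local_summary(site, centroids) for site in sites]
        updated = merge_summaries(summaries, centroids)
        shift = max(sq_dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift <= tol:
            break
    return centroids

# Three sites, each holding a horizontal slice of the records.
sites = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(10.0, 10.0), (10.0, 12.0)],
    [(0.0, 1.0), (10.0, 11.0)],
]
print(distributed_kmeans(sites, [(0.0, 0.0), (10.0, 10.0)]))
# prints [(0.0, 1.0), (10.0, 11.0)]
```

Because cluster means are ratios of sums to counts, merging the per-site sums and counts yields the same centroids as running k-means on the pooled data from the same starting centroids. The vertically distributed case needs a different decomposition and is not covered by this sketch.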

Bibliographic details

Source: Computing and informatics, 2014, Issue 4, 22 pages
Authors: Khedr Ahmed M.; Bhatnagar Raj K.
Original format: PDF
CLC classification: Computing technology, computer technology

Similar literature

1. A hybrid MapReduce-based k-means clustering using genetic algorithm for distributed datasets [J]. Sinha Ankita, Jana Prasanta K. Journal of supercomputing. 2018, Issue 4.
2. A distributed k-mean clustering algorithm for cloud data mining [J]. Renu Asnani. International Journal of Engineering Trends and Technology. 2015, Issue 7.
3. New Algorithm for Clustering Distributed Data Using k-Means [J]. Ahmed M. Khedr, Raj K. Bhatnagar. Computing and informatics. 2014, Issue 4.
4. Distributed k-mean algorithm for data clustering [C]. Her-Kun Chang. 8th World Multi-Conference on Systemics, Cybernetics and Informatics (SCI 2004), vol. 15: Post-Conference Issue. 2004.
5. Clustering educational digital library usage data: Comparisons of latent class analysis and K-means algorithms [D]. Xu, Beijie. 2011.
6. Balancing effort and benefit of K-means clustering algorithms in Big Data realms [O]. Joaquín Pérez-Ortega, Nelva Nely Almanza-Ortega, David Romero. 2012.
7. Warped K-Means: An algorithm to cluster sequentially-distributed data [O]. Luis A. Leiva, Enrique Vidal. 2013.
