Journal: Future Generation Computer Systems

Mining massive datasets by an unsupervised parallel clustering on a GRID: Novel algorithms and case study



Abstract

This paper proposes three novel parallel clustering algorithms based on Kohonen's SOM. They aim to preserve the topology of the original dataset, both to visualize the results meaningfully and to discover associations between features of the dataset through topological operations over the clusters. In all three algorithms the data to be clustered are subdivided among the nodes of a GRID. In the first two algorithms each node executes an on-line SOM, whereas in the third the nodes execute a quasi-batch SOM called MANTRA. The algorithms differ in how a master recombines the weights computed by the slave nodes to launch the next epoch of the SOM on the nodes. A proof outline demonstrates the convergence of the proposed parallel SOMs and indicates how to select the learning rate to outperform both the sequential SOM and the parallel SOMs available in the literature. A bioinformatics case study illustrates that the parallel SOM obtains meaningful clusters in massive data mining applications in a fraction of the time needed by the sequential SOM, and that the resulting classification supports fruitful knowledge extraction from massive datasets.
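
The master/slave scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the master recombines the slaves' weights, so the plain element-wise average used here is an assumption, not the paper's rule. The nodes are simulated sequentially in one process, the map is one-dimensional, and the names `online_som_epoch`, `average_weights`, and `parallel_som` are hypothetical. The quasi-batch MANTRA variant is not shown.

```python
import math
import random


def online_som_epoch(weights, data, lr, radius):
    """Run one on-line SOM pass over `data` on a 1-D map; return updated weights."""
    w = [list(unit) for unit in weights]  # work on a copy, as a slave node would
    for x in data:
        # Best-matching unit: the unit whose weight vector is closest to x.
        bmu = min(
            range(len(w)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(w[i], x)),
        )
        for i, unit in enumerate(w):
            # Gaussian neighbourhood measured on the map index.
            h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
            for d in range(len(unit)):
                unit[d] += lr * h * (x[d] - unit[d])
    return w


def average_weights(per_node_weights):
    """Master step (ASSUMED rule): element-wise mean of the slaves' weight maps."""
    n = len(per_node_weights)
    units = len(per_node_weights[0])
    dim = len(per_node_weights[0][0])
    return [
        [sum(w[u][d] for w in per_node_weights) / n for d in range(dim)]
        for u in range(units)
    ]


def parallel_som(data, n_nodes=4, n_units=5, epochs=20, lr=0.5, radius=1.0, seed=0):
    """Simulate the master/slave loop: shard the data, train locally, recombine."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    # Subdivide the data among the simulated nodes.
    shards = [data[i::n_nodes] for i in range(n_nodes)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # linear decay, an illustrative choice
        local = [
            online_som_epoch(weights, shard, lr * decay, max(radius * decay, 0.1))
            for shard in shards
        ]
        # The master recombines the local maps and launches the next epoch.
        weights = average_weights(local)
    return weights
```

On a real GRID each call to `online_som_epoch` would run on a separate node, and only the weight maps would travel between the slaves and the master at each epoch.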
