Journal of Parallel and Distributed Computing

Designing an efficient parallel spectral clustering algorithm on multi-core processors in Julia


Abstract

Spectral clustering is widely used in data mining, machine learning, and other fields. It can identify clusters of arbitrary shape in a sample space and converge to the globally optimal solution. Compared with the traditional k-means algorithm, spectral clustering adapts better to the data and produces better clustering results. However, the algorithm is computationally expensive. In this paper, we propose an efficient parallel spectral clustering algorithm for multi-core processors, written in the Julia language, which we refer to as juPSC. Julia is a high-performance, open-source programming language. The juPSC consists of three procedures: (1) calculating the affinity matrix, (2) calculating the eigenvectors, and (3) conducting k-means clustering. Procedures (1) and (3) are computed by an efficient parallel algorithm, and the COO (coordinate) format is used to compress the affinity matrix. Two groups of experiments are conducted to verify the accuracy and efficiency of the juPSC. The experimental results indicate that (1) the juPSC achieves speedups of approximately 14× to 18× on a 24-core CPU and that (2) the serial version of the juPSC is faster than the Python implementation in scikit-learn. Moreover, the structure and functions of the juPSC are designed with modularity in mind, which makes it convenient to combine with, and further optimize for, other parallel computing platforms.
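To make procedure (1) and the COO compression more concrete, here is a minimal sketch in Python (not Julia, and not the authors' code). The abstract does not say which similarity kernel or sparsification rule juPSC uses, so the Gaussian kernel, the `sigma` and `threshold` parameters, and the function names `affinity_block` and `affinity_coo` are all assumptions made for illustration. The matrix is built one row block at a time; because the blocks do not depend on each other, they are the kind of unit that could be handed to separate cores.

```python
import math

def affinity_block(points, row_start, row_end, sigma=1.0, threshold=1e-3):
    """Return COO triplets (rows, cols, vals) for rows [row_start, row_end).

    Assumption: Gaussian similarity, with entries below `threshold` dropped.
    """
    rows, cols, vals = [], [], []
    for i in range(row_start, row_end):
        for j, q in enumerate(points):
            if i == j:
                continue  # skip self-similarity on the diagonal
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], q))
            w = math.exp(-d2 / (2.0 * sigma ** 2))
            if w >= threshold:
                # COO stores only the retained entries as (row, col, value)
                rows.append(i)
                cols.append(j)
                vals.append(w)
    return rows, cols, vals

def affinity_coo(points, n_blocks=2, sigma=1.0, threshold=1e-3):
    """Assemble the full COO affinity matrix from independent row blocks."""
    n = len(points)
    step = max(1, -(-n // n_blocks))  # ceiling division
    rows, cols, vals = [], [], []
    for start in range(0, n, step):
        # Each block is independent of the others; run serially here.
        r, c, v = affinity_block(points, start, min(start + step, n),
                                 sigma, threshold)
        rows += r
        cols += c
        vals += v
    return rows, cols, vals

# Two well-separated pairs of points: only within-pair affinities survive.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
rows, cols, vals = affinity_coo(points)
```

In this toy input, only 4 of the 16 possible entries are stored, which illustrates why a triplet format can save memory when most affinities are negligible.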
