Conference: International Conference on Computational Science

Analysis of the Construction of Similarity Matrices on Multi-core and Many-Core Platforms Using Different Similarity Metrics

Abstract

Similarity matrices are 2D representations of the degree of similarity between the points of a given dataset, and they are employed in fields such as data mining, genetics, and machine learning. However, their calculation has quadratic complexity and is therefore especially expensive for large datasets. MPICorMat accelerates the construction of these matrices through a hybrid parallelization strategy based on MPI and OpenMP. The previous version of this tool achieved high performance and scalability, but it implemented only a single similarity metric, Pearson's correlation. It was therefore suitable only for problems where the data are normally distributed and there is a linear relationship between variables. In this work, we present an extension to MPICorMat that incorporates eight additional similarity metrics, so that users can choose the one that best suits their problem. The performance and energy consumption of each metric are measured on two platforms: a multi-core platform with two Intel Xeon Sandy Bridge processors and a many-core Intel Xeon Phi KNL. Results show that MPICorMat executes faster and consumes less energy on the many-core architecture. The new version of MPICorMat is publicly available for download from its website: https://sourceforge.net/projects/mpicormat/
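
To illustrate why the cost grows quadratically, the following is a minimal serial sketch in Python of building a similarity matrix with Pearson's correlation. It is not MPICorMat's code and does not reflect its MPI/OpenMP parallelization; the function names `pearson` and `similarity_matrix` and the sample data are hypothetical, chosen only for illustration. The sketch assumes a symmetric metric, so only the n(n-1)/2 pairs above the diagonal are computed and then mirrored.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # A constant row has zero variance; return NaN instead of dividing by zero.
    return cov / denom if denom > 0 else float("nan")


def similarity_matrix(rows, metric=pearson):
    """Return (matrix, pairs): the symmetric similarity matrix and the
    number of off-diagonal pairs that were actually evaluated."""
    n = len(rows)
    sim = [[0.0] * n for _ in range(n)]
    pairs = 0
    for i in range(n):
        sim[i][i] = metric(rows[i], rows[i])
        # Assumes metric(a, b) == metric(b, a): compute the upper triangle only.
        for j in range(i + 1, n):
            sim[i][j] = sim[j][i] = metric(rows[i], rows[j])
            pairs += 1
    return sim, pairs


# Hypothetical sample data: three points with four features each.
data = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # linearly increasing with row 0
    [4.0, 3.0, 2.0, 1.0],  # linearly decreasing with row 0
]
sim, pairs = similarity_matrix(data)
```

Because `metric` is passed as a parameter, a different similarity measure could be swapped in without changing the pairwise loop, which is the part whose cost scales with the square of the number of points.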
