International Journal of Parallel Programming
Parallel Asynchronous Strategies for the Execution of Feature Selection Algorithms


Abstract

Reducing the dimensionality of a dataset is a fundamental step in building a classification model. Feature selection is the process of selecting a smaller subset of features from the original set in order to enhance the performance of the classification model. The problem is known to be NP-hard, and despite the existence of several algorithms, none outperforms the others in all scenarios. Because of this complexity, feature selection algorithms usually have to compromise the quality of their solutions in order to execute in a practicable amount of time. Parallel computing techniques emerge as a potential way to tackle this problem. Several approaches already execute feature selection in parallel using synchronous models, which are preferred for their simplicity and their compatibility with any feature selection algorithm. However, synchronous models introduce pausing points in the execution flow, which degrade parallel performance. In this paper, we discuss the challenges of executing feature selection algorithms in parallel using asynchronous models, and present a feature selection algorithm that favours such models. Furthermore, we present two strategies for the asynchronous parallel execution not only of our algorithm but of any other feature selection approach. The first strategy solves the problem using the distributed memory paradigm, while the second exploits shared memory. We evaluate the parallel performance of our strategies using up to 32 cores. The results show near-linear speedups for both strategies, with the shared memory strategy outperforming the distributed one. Additionally, we provide an example of adapting our strategies to execute Sequential Forward Search asynchronously, and compare this version against a synchronous one. Our results reveal that, by using an asynchronous strategy, we save an average of 7.5% of the execution time.
