International Journal of Parallel Programming
Parallel Asynchronous Strategies for the Execution of Feature Selection Algorithms


Abstract

Reducing the dimensionality of a dataset is a fundamental step in building a classification model. Feature selection is the process of selecting a smaller subset of features from the original set in order to enhance the performance of the classification model. The problem is known to be NP-hard, and despite the existence of several algorithms, none outperforms the others in all scenarios. Because of this complexity, feature selection algorithms usually have to compromise the quality of their solutions in order to execute in a practicable amount of time. Parallel computing techniques emerge as a potential way to tackle this problem. Several approaches already execute feature selection in parallel using synchronous models, which are preferred for their simplicity and their compatibility with any feature selection algorithm. However, synchronous models introduce pausing points in the execution flow, which degrade parallel performance. In this paper, we discuss the challenges of executing feature selection algorithms in parallel using asynchronous models, and present a feature selection algorithm that favours such models. Furthermore, we present two strategies for the asynchronous parallel execution not only of our algorithm but of any other feature selection approach. The first strategy solves the problem using the distributed memory paradigm, while the second exploits shared memory. We evaluate the parallel performance of our strategies using up to 32 cores. The results show near-linear speedups for both strategies, with the shared memory strategy outperforming the distributed one. Additionally, we provide an example of adapting our strategies to execute Sequential Forward Search asynchronously, and compare this version against a synchronous one. Our results reveal that, by using an asynchronous strategy, we save an average of 7.5% of the execution time.
