GPU Acceleration for Simulating Massively Parallel Many-Core Platforms

Raghav Shivani; Ruggiero Martino; Marongiu Andrea; Pinto Christian; Atienza David; Benini Luca


Abstract

Emerging massively parallel architectures, such as a general-purpose processor coupled with many-core programmable accelerators, are creating an increasing demand for novel methods of architectural simulation. Most state-of-the-art simulation technologies are exceedingly slow, and the need to model full-system many-core architectures adds further complexity. This paper presents a fast, scalable and parallel simulator, which uses a novel methodology to accelerate the simulation of a many-core coprocessor using GPU platforms. The main idea is to use [...]. The target architecture of the associated [...]. Simulation of many target nodes is mapped to the many hardware threads available on highly parallel GPU platforms. We demonstrate the challenges, feasibility and benefits of using a heterogeneous system (CPU and GPU) to simulate the architecture of future many-core heterogeneous platforms. The target architecture selected to evaluate our methodology consists of an ARM general-purpose CPU coupled with a many-core coprocessor with thousands of simple in-order cores connected in a tile network. This work presents optimization techniques used to parallelize the simulation specifically for acceleration on GPUs. We partition the full-system simulation between CPU and GPU: the target general-purpose CPU is simulated on the host CPU, whereas the many-core coprocessor is simulated on the NVIDIA Tesla 2070 GPU platform. Our experiments show performance of up to 50 MIPS when simulating the entire heterogeneous chip, and high scalability as the number of cores on the coprocessor increases.
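The mapping of target nodes to hardware threads described in the abstract can be pictured with a small sketch. This is not the authors' simulator: it is a plain-Python stand-in for a GPU kernel, and the toy instruction set (`ADD`, `ADD_ID`, `HALT`) and the names `step_core` and `simulate` are hypothetical, introduced only for illustration. Each call to `step_core` plays the role of one GPU thread advancing one simulated core by one instruction, and each pass of the outer loop stands in for one kernel launch followed by a barrier.

```python
# Hypothetical toy ISA (an assumption for illustration, not from the paper).
ADD, ADD_ID, HALT = range(3)


def step_core(tid, pcs, accs, halted, program):
    """Advance simulated core `tid` by one instruction.

    On a GPU this body would run as one thread, indexed by `tid`.
    Returns 1 if an instruction was executed, 0 if the core is halted.
    """
    if halted[tid]:
        return 0
    op, arg = program[pcs[tid]]
    if op == ADD:
        accs[tid] += arg          # add an immediate to the accumulator
        pcs[tid] += 1
    elif op == ADD_ID:
        accs[tid] += tid          # add the core's own id, so cores diverge
        pcs[tid] += 1
    elif op == HALT:
        halted[tid] = True
    return 1


def simulate(n_cores, program, max_steps=1000):
    """Run all simulated cores in lockstep until every core has halted."""
    pcs = [0] * n_cores           # per-core program counters
    accs = [0] * n_cores          # per-core accumulators
    halted = [False] * n_cores
    executed_total = 0
    for _ in range(max_steps):
        # Stand-in for a kernel launch with n_cores threads; finishing
        # the inner iteration acts as the barrier between steps.
        executed = sum(
            step_core(tid, pcs, accs, halted, program)
            for tid in range(n_cores)
        )
        executed_total += executed
        if executed == 0:
            break
    return accs, executed_total


if __name__ == "__main__":
    demo_program = [(ADD, 5), (ADD_ID, 0), (HALT, 0)]
    print(simulate(4, demo_program))
```

Because every simulated core owns its own slot in `pcs`, `accs` and `halted`, the per-core steps are independent within one pass, which is the property that makes a one-thread-per-core mapping possible in the first place.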


Bibliographic record

Source
Parallel and Distributed Systems, IEEE Transactions on | 2015, Issue 5 | pp. 1336-1349 | 14 pages
Authors
Raghav Shivani; Ruggiero Martino; Marongiu Andrea; Pinto Christian; Atienza David; Benini Luca;
Author affiliation

Embedded Systems Laboratory, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Vaud, Switzerland;

Original format: PDF
Language: English (eng)
Keywords
Computational modeling; Computer architecture; Coprocessors; Graphics processing units; Instruction sets; Scalability; Synchronization; CUDA; GPGPU; Parallel simulation; QEMU; accelerators; heterogeneous architectures; many-core processors;


Similar literature

1. Tsunami: massively parallel homomorphic hashing on many-core GPUs [J]. Xiaowen Chu, Kaiyong Zhao, Zongpeng Li. Concurrency and computation: practice and experience, 2012, no. 17.
2. Parallelized Kalman-Filter-Based Reconstruction of Particle Tracks on Many-Core Processors and GPUs [J]. Giuseppe Cerati, Peter Elmer, Slava Krutelyov. EPJ Web of Conferences, 2017, no. 12.
3. Parallelized Kalman-Filter-Based Reconstruction of Particle Tracks on Many-Core Processors and GPUs [J]. Giuseppe Cerati, Peter Elmer, Slava Krutelyov. EPJ Web of Conferences, 2017, no. 13.
4. Parallelization strategies of the canny edge detector for multi-core CPUs and many-core GPUs [C]. Ben Cheikh Taieb Lamine, Beltrame Giovanni, Nicolescu Gabriela. 10th IEEE International New Circuits and Systems Conference, 2012.
5. Parallelization framework for scientific application kernels on multi-core/many-core platforms [D]. Peng, Liu. 2011.
6. An open architecture for the massively parallel emulation of the Drosophila brain on multiple GPUs [O]. Lev E Givon, Aurel A Lazar. 1936.
7. GPU Acceleration for Simulating Massively Parallel Many-core Platforms [O]. Shivani Raghav, Martino Ruggiero, Andrea Marongiu. 2015.
