Programming NVIDIA cards by means of transitive closure based parallelization algorithms

Marek PALKOWSKI; Wlodzimierz BIELECKI

Abstract

Massively parallel processing is a type of computing that uses many separate CPUs or GPUs running in parallel to execute a single program. Because most computations are contained in program loops, automatically extracting the parallelism available in loops is extremely important for many-core systems. In this paper, we study the speed-up and scalability of parallel code that scans synchronization-free slices and time partitions on a Tesla S1070, a machine with 960 CUDA cores.

Publication details

Source
Przeglad Elektrotechniczny, 2012, No. 10b, pp. 217-222 (6 pages)
Authors
Marek PALKOWSKI; Wlodzimierz BIELECKI
Author affiliation (both authors)
Zachodniopomorski Uniwersytet Technologiczny, Katedra Inzynierii Oprogramowania, ul. Zolnierska 49, 71-210 Szczecin
Original format: PDF
Language: English
Keywords
parallel program loops; many-core machines; synchronization-free slicing; free-scheduling


Related literature

1. Using transitive closure and transitive reduction to extract coarse-grained parallelism in program loops [J]. Włodzimierz BIELECKI, Marek PALKOWSKI, Krzysztof SIEDLECKI. Pomiary Automatyka Kontrola. 2010, No. 8
2. Efficient Implementation Of The Italiano Algorithms For Updating The Transitive Closure On Associative Parallel Processors [J]. Anna Nepomniaschaya. Fundamenta Informaticae. 2008, No. 2a3
3. Parallel transitive closure algorithm [J]. C. E. R. Alves, E. N. Cáceres, A. A. de Castro. Brazilian Computer Society. Journal. 2013, No. 2
4. Parallel transitive closure and transitive reduction algorithms [C]. Chang P., Henschen L.J. Solid-State Circuits Conference, 1996. ISSCC. 1996
5. GRAph Parallel Actor Language --- A Programming Language for Parallel Graph Algorithms [D]. DeLorimier, Michael. 2013
6. Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets [O]. D. D. Shrimankar, S. R. Sathe. 2016
7. A parallel transitive closure algorithm using hash-based clustering [O]. Cheiney, Jean-Pierre; De Maindreville, Christophe. 1988
8. Fast parallel algorithms that compute transitive closure of a fuzzy relation [R]. Kreinovich, Vladik YA. 1993
