...
首页> 外文期刊>IEICE transactions on information and systems >Optimal Parallel Algorithms for Computing the Sum, the Prefix-Sums, and the Summed Area Table on the Memory Machine Models
【24h】

Optimal Parallel Algorithms for Computing the Sum, the Prefix-Sums, and the Summed Area Table on the Memory Machine Models

机译:用于在内存机器模型上计算总和,前缀总和和总面积表的最佳并行算法

获取原文
           

摘要

The main contribution of this paper is to show optimal parallel algorithms to compute the sum, the prefix-sums, and the summed area table on two memory machine models, the Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM). The DMM and the UMM are theoretical parallel computing models that capture the essence of the shared memory and the global memory of GPUs. These models have three parameters, the number p of threads, and the width w of the memory, and the memory access latency l . We first show that the sum of n numbers can be computed in $O({nover w}+{nlover p}+llog n)$ time units on the DMM and the UMM. We then go on to show that $Omega({nover w}+{nlover p}+llog n)$ time units are necessary to compute the sum. We also present a parallel algorithm that computes the prefix-sums of n numbers in $O({nover w}+{nlover p}+llog n)$ time units on the DMM and the UMM. Finally, we show that the summed area table of size $sqrt{n}imessqrt{n}$ can be computed in $O({nover w}+{nlover p}+llog n)$ time units on the DMM and the UMM. Since the computation of the prefix-sums and the summed area table is at least as hard as the sum computation, these parallel algorithms are also optimal.
机译:本文的主要贡献是展示了用于在两种存储机器模型(离散存储机器(DMM)和统一存储机器(UMM))上计算总和,前缀总和和总面积表的最佳并行算法。 DMM和UMM是理论上的并行计算模型,可捕获GPU的共享内存和全局内存的本质。这些模型具有三个参数,线程数 p,内存的宽度 w,以及​​内存访问延迟 1。我们首先显示,可以在DMM和UMM上以$ O({n w} + {nl over p} +1 log n)$个时间单位来计算 n个数的总和。然后,我们继续表明,需要$ Omega({n over w} + {nl over p} +1 log n)$个时间单位来计算总和。我们还提出了一种并行算法,用于计算DMM和UMM上$ O({n over w} + {nl over p} + l log n)$个时间单位的 n个数字的前缀和。最后,我们证明大小为$ sqrt {n} times sqrt {n} $的总面积表可以在$ O({n over w} + {nl over p} + l log n )$ DMM和UMM上的时间单位。由于前缀和和和面积表的计算至少与和计算一样困难,因此这些并行算法也是最佳的。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号