首页> 外文期刊>IEICE transactions on information and systems >Optimal Parallel Algorithms for Computing the Sum, the Prefix-Sums, and the Summed Area Table on the Memory Machine Models
【24h】

Optimal Parallel Algorithms for Computing the Sum, the Prefix-Sums, and the Summed Area Table on the Memory Machine Models

机译:Optimal Parallel Algorithms for Computing the Sum, the Prefix-Sums, and the Summed Area Table on the Memory Machine Models

获取原文
获取原文并翻译 | 示例
           

摘要

The main contribution of this paper is to show optimal parallel algorithms to compute the sum, the prefix-sums, and the summed area table on two memory machine models, the Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM). The DMM and the UMM are theoretical parallel computing models that capture the essence of the shared memory and the global memory of GPUs. These models have three parameters, the number p of threads, and the width w of the memory, and the memory access latency l. We first show that the sum of n numbers can be computed in O(n/w + nl/p + l log n) time units on the DMM and the UMM. We then go on to show that Ω(n/w + nl/p + l log n) time units are necessary to compute the sum. We also present a parallel algorithm that computes the prefix-sums of n numbers in O(n/w + nl/p + l log n) time units on the DMM and the UMM. Finally, we show that the summed area table of size n~(1/2)× n~(1/2)× can be computed in O(n/w + j + l log n) time units on the DMM and the UMM. Since the computation of the prefix-sums and the summed area table is at least as hard as the sum computation, these parallel algorithms are also optimal.

著录项

获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号