Conference: International parallel processing

Communication and computation patterns of large scale image convolutions on parallel architectures


Abstract

Segmentation and other image processing operations rely on convolution calculations with heavy computational and memory access demands. This article presents an analysis of a texture segmentation application containing a 96 × 96 convolution. Sequential execution required several hours on single-processor systems, with over 99% of the time spent performing the large convolution. Between 70% and 75% of execution time is attributable to cache misses within the convolution. We implemented the same application on CM-5, iPSC/860, and PVM distributed-memory multicomputers, tailoring the parallel algorithms to each machine's architecture. Parallelization significantly reduced execution time, to 49 seconds on a 512-node CM-5 and 6.5 minutes on a 32-node iPSC/860. The results indicate that, for large-kernel convolutions, the size and bandwidth of the fast memory store are more important than processor power or communication overhead.
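To make the cost of a large-kernel convolution concrete, the following is a minimal sketch, not the authors' implementation. It assumes a direct (non-FFT) "valid" convolution written in correlation form (the kernel is not flipped) and a simple row-band decomposition in which each worker receives its rows plus a halo of kernel-height minus one extra rows. All function names are hypothetical, and the bands are processed sequentially here to stand in for separate nodes.

```python
def convolve_valid(image, kernel):
    """Direct 2-D 'valid' convolution in correlation form (kernel not flipped)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            acc = 0.0
            # Every output pixel touches kh * kw input pixels.
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


def split_rows_with_halo(image, kernel_height, workers):
    """Split the output rows into contiguous bands, one per worker.

    Each band carries kernel_height - 1 extra input rows (the halo), which is
    the data a worker would have to receive from its neighbour.
    """
    out_rows = len(image) - kernel_height + 1
    base, extra = divmod(out_rows, workers)
    bands, start = [], 0
    for w in range(workers):
        count = base + (1 if w < extra else 0)
        if count == 0:
            continue  # more workers than output rows
        bands.append(image[start:start + count + kernel_height - 1])
        start += count
    return bands


def convolve_banded(image, kernel, workers):
    """Convolve each band independently and concatenate the results."""
    out = []
    for band in split_rows_with_halo(image, len(kernel), workers):
        out.extend(convolve_valid(band, kernel))
    return out


def multiply_adds(out_rows, out_cols, kernel_rows, kernel_cols):
    """Multiply-accumulate operations for a direct convolution."""
    return out_rows * out_cols * kernel_rows * kernel_cols
```

Under these assumptions, a 96 × 96 kernel costs 96 · 96 = 9216 multiply-adds per output pixel, and each of those reads an input value that may or may not still be in cache, which is consistent with the abstract's emphasis on fast-memory size and bandwidth.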
