Communication and computation patterns of large scale image convolutions on parallel architectures



Abstract

Segmentation and other image processing operations rely on convolution calculations with heavy computational and memory access demands. The article presents an analysis of a texture segmentation application containing a 96×96 convolution. Sequential execution required several hours on single-processor systems, with over 99% of the time spent performing the large convolution. Between 70% and 75% of execution time is attributable to cache misses within the convolution. We implemented the same application on CM-5, iPSC/860 and PVM distributed-memory multicomputers, tailoring the parallel algorithms to each machine's architecture. Parallelization significantly reduced execution time, to 49 seconds on a 512-node CM-5 and 6.5 minutes on a 32-node iPSC/860. The results indicate that, for large-kernel convolutions, the size and bandwidth of the fast memory store are more important than processor power or communication overhead.
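
To make the cost of a 96×96 kernel concrete, here is a minimal sketch of direct 2D convolution together with a multiply-add count. It is not the authors' implementation: the abstract does not describe their code or give the image size, so the function names `convolve2d_valid` and `multiply_adds`, the "valid region" boundary handling, and the tiny test image are all assumptions made for illustration.

```python
def convolve2d_valid(image, kernel):
    """Direct 2D convolution over the 'valid' region, using plain lists.

    Hypothetical helper for illustration only; the boundary handling
    (valid region, no padding) is an assumption, not taken from the paper.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    # Index the kernel back to front so this is a true
                    # convolution rather than a correlation.
                    acc += image[y + i][x + j] * kernel[kh - 1 - i][kw - 1 - j]
            out[y][x] = acc
    return out


def multiply_adds(image_h, image_w, kernel_h, kernel_w):
    """Multiply-add count of the direct method over the valid region."""
    out_h = image_h - kernel_h + 1
    out_w = image_w - kernel_w + 1
    return out_h * out_w * kernel_h * kernel_w


if __name__ == "__main__":
    small = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    diff = [[1, 0], [0, -1]]
    print(convolve2d_valid(small, diff))
    # Work per output pixel for a 96x96 kernel.
    print(multiply_adds(96, 96, 96, 96))
```

Each output pixel of the direct method with a 96×96 kernel needs 96 × 96 = 9216 multiply-adds and reads the same number of image values. The inner loops walk across 96 separate image rows per output pixel, which is the kind of access pattern in which a small cache cannot hold the working set.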


Bibliographic record

Source: International parallel processing, 1994, 6 pages
Authors: Dykes S.G.; Xiaodong Zhang; Institute of Electrical and Electronics Engineers
Original format: PDF
Chinese Library Classification: Information processing

