Journal: ACM Transactions on Reconfigurable Technology and Systems

ARC 2014: Towards a Fast FPGA Implementation of a Heap-Based Priority Queue for Image Coding Using a Parallel Index-Aware Tree



Abstract

Embedded image processing systems such as smartphones and digital cameras have tight limits on storage, computation power, network connectivity, and battery usage. These limitations make efficient image coding important. In this article, we present a novel heap-based priority queue structure employed by an Adaptive Scanning of Wavelet Data (ASWD) scheme targeting an embedded platform. ASWD is a context modeling block, implemented via priority queues in a wavelet-based image coder, that reorganizes the wavelet coefficients into locally stationary sequences. The architecture we propose makes efficient, adaptive use of the FPGA's on-chip dual-port memories. An index-aware system linked to each element in the queue makes the location of each queue element traceable in the heap, as required by the ASWD algorithm. Moreover, the use of 4-port memories along with intelligent data concatenation of queue elements yields cost-effective, enhanced memory access. The memory ports are adaptively assigned to different units during different processing phases so that each phase makes the best use of the memory access it requires. The architectural innovations can also be exploited in other applications that require efficient hardware implementations of a generic priority queue, or in classical sorting applications which sort into the index. We designed and validated the hardware on an Altera Stratix IV FPGA as an IP accelerator in a Nios II processor-based System on Chip. We show that our architecture at 150 MHz can provide a 45X speedup compared to an embedded ARM Cortex-A9 processor at 666 MHz, targeting a throughput of 10 MB/s.
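To illustrate what "index-aware" means in the abstract, the following is a minimal software sketch of a binary heap that keeps a side table recording where each element currently sits, so an element can be located and re-prioritized without searching. It is not the authors' FPGA architecture: the class name `IndexAwareHeap`, the `pos` table, the max-heap ordering, and the `update` operation are all assumptions made for illustration, and the multi-port memory scheme is not modeled.

```python
class IndexAwareHeap:
    """Binary max-heap with a position table (illustrative, hypothetical names)."""

    def __init__(self):
        self.heap = []  # list of (priority, element_id) in heap order
        self.pos = {}   # element_id -> current index in self.heap

    def _swap(self, i, j):
        # Swap two slots and keep the position table in step with the heap.
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.pos[self.heap[i][1]] = i
        self.pos[self.heap[j][1]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.heap[i][0] <= self.heap[parent][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.heap[child][0] > self.heap[largest][0]:
                    largest = child
            if largest == i:
                break
            self._swap(i, largest)
            i = largest

    def push(self, element_id, priority):
        if element_id in self.pos:
            raise KeyError(f"{element_id!r} is already queued")
        self.heap.append((priority, element_id))
        self.pos[element_id] = len(self.heap) - 1
        self._sift_up(len(self.heap) - 1)

    def update(self, element_id, priority):
        # The position table gives the slot directly; no scan of the heap.
        i = self.pos[element_id]
        old_priority = self.heap[i][0]
        self.heap[i] = (priority, element_id)
        if priority > old_priority:
            self._sift_up(i)
        else:
            self._sift_down(i)

    def pop(self):
        # Remove and return the (element_id, priority) with the highest priority.
        top_priority, top_id = self.heap[0]
        self._swap(0, len(self.heap) - 1)
        self.heap.pop()
        del self.pos[top_id]
        if self.heap:
            self._sift_down(0)
        return top_id, top_priority


queue = IndexAwareHeap()
queue.push("a", 3)
queue.push("b", 5)
queue.push("c", 1)
queue.update("c", 9)  # located through queue.pos, then sifted up
print(queue.pop())    # ('c', 9)
```

Every swap updates the position table, which is the software analogue of keeping each queue element traceable in the heap. Whether ASWD needs a max-heap or a min-heap is not stated in the abstract; flipping the two comparisons gives the other ordering.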
