Source: Journal of Real-Time Image Processing

Real-time rate distortion-optimized image compression with region of interest on the ARM architecture for underwater robotics applications



Abstract

This paper proposes a real-time progressive image compression algorithm with region-of-interest support for the ARM processor architecture. The algorithm is used in the design of an underwater image sensor for an intervention autonomous underwater vehicle operating under highly constrained bandwidth, allowing more agile data exchange between the vehicle and the human operator supervising the underwater intervention. At high compression ratios (smaller output size), execution time is dominated by the transform, whose share shrinks progressively as the compression ratio decreases (larger output size). A novel progressive rate-distortion-optimized image compression algorithm based on the discrete wavelet transform (DWT) is presented, with special emphasis on a novel minimal-time parallel DWT algorithm that saturates the memory bandwidth using only a few cores of a modern multicore embedded processor. The paper focuses on a novel, efficient, in-place, multithreaded, and cache-friendly parallel 2-D wavelet transform algorithm based on the lifting transform on the ARM architecture. To maximize cache utilization and consequently minimize memory bus bandwidth use, the threads compete to work on a small memory area, maximizing the chances of finding the data in the cache. Their synchronization has very low overhead: it uses no locks and relies solely on the basic compare-and-swap atomic primitive. A C implementation, with and without vector (single instruction, multiple data) instructions, is provided for both single-threaded (serial) and multithreaded (parallel) single-loop DWT implementations, as well as serial and parallel naive implementations using linear (row-order) and strided (column-order) memory access patterns for comparison.
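The two mechanisms named above, a lifting-based in-place transform and lock-free work distribution through compare-and-swap, can be sketched in C as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the abstract does not say which wavelet is used, so the reversible integer 5/3 lifting scheme is assumed here, and threads claim whole rows rather than the small shared memory area the paper describes. All names (`lift53_row`, `unlift53_row`, `dwt_job`, `claim_row`, `dwt_rows_worker`) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Floor division for a positive divisor (portable for negative a). */
static int floordiv(int a, int b)
{
    int q = a / b;
    return (a % b < 0) ? q - 1 : q;
}

/* One level of an in-place integer 5/3 lifting transform on a row of
   even length n (assumed wavelet). Afterwards even indices hold the
   low-pass coefficients and odd indices the high-pass coefficients. */
static void lift53_row(int *x, size_t n)
{
    assert(n >= 2 && n % 2 == 0);
    /* Predict: odd sample minus the floor-average of its even neighbours. */
    for (size_t i = 1; i < n; i += 2) {
        int left = x[i - 1];
        int right = (i + 1 < n) ? x[i + 1] : left; /* symmetric extension */
        x[i] -= floordiv(left + right, 2);
    }
    /* Update: even sample plus a rounded quarter of neighbouring details. */
    for (size_t i = 0; i < n; i += 2) {
        int right = x[i + 1];
        int left = (i > 0) ? x[i - 1] : right;     /* symmetric extension */
        x[i] += floordiv(left + right + 2, 4);
    }
}

/* Inverse of lift53_row: undo the update step, then the predict step. */
static void unlift53_row(int *x, size_t n)
{
    assert(n >= 2 && n % 2 == 0);
    for (size_t i = 0; i < n; i += 2) {
        int right = x[i + 1];
        int left = (i > 0) ? x[i - 1] : right;
        x[i] -= floordiv(left + right + 2, 4);
    }
    for (size_t i = 1; i < n; i += 2) {
        int left = x[i - 1];
        int right = (i + 1 < n) ? x[i + 1] : left;
        x[i] += floordiv(left + right, 2);
    }
}

/* Shared job description; next_row is the only shared mutable state. */
typedef struct {
    int *data;              /* row-major image, rows * cols samples */
    size_t rows, cols;
    atomic_size_t next_row; /* index of the next unclaimed row */
} dwt_job;

/* Claim the next unprocessed row using only compare-and-swap.
   Returns a value >= job->rows when no rows are left. */
static size_t claim_row(dwt_job *job)
{
    size_t cur = atomic_load(&job->next_row);
    while (cur < job->rows &&
           !atomic_compare_exchange_weak(&job->next_row, &cur, cur + 1)) {
        /* A failed CAS reloads cur with the current value; retry. */
    }
    return cur;
}

/* Body each worker thread would run: claim rows until none remain. */
static void dwt_rows_worker(dwt_job *job)
{
    size_t r;
    while ((r = claim_row(job)) < job->rows)
        lift53_row(job->data + r * job->cols, job->cols);
}
```

To use the sketch, initialize `next_row` with `atomic_init(&job.next_row, 0)` and have every thread call `dwt_rows_worker(&job)` on the same `dwt_job`. Because each row index is handed out exactly once by the compare-and-swap, no lock is needed and threads that finish early simply claim the next row.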
Results show a significant improvement over the single-threaded optimized implementation and a much greater improvement over both the single-threaded and multithreaded naive implementations. The running time reaches a minimum that depends on the memory access pattern, the number of processor cores, and the available memory bus bandwidth; that is, the algorithm becomes memory bound while using the minimum number of memory accesses. Because of memory saturation, the in-place 2-D DWT can be executed in the same time as a 1-D DWT or an in-place memory block copy.
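The dependence on memory access pattern can be illustrated with the vertical pass of a 2-D transform. The sketch below is not the paper's single-loop algorithm; it only contrasts a strided (column-order) traversal with a linear (row-order) traversal of the same arithmetic. The predict step of the integer 5/3 lifting scheme is an assumption, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* floor(a / 2), portable for negative a. */
static int floor_half(int a)
{
    int q = a / 2;
    return (a % 2 < 0) ? q - 1 : q;
}

/* Strided version: walk each column top to bottom, so consecutive
   accesses are cols elements apart in a row-major image. */
static void predict_cols_strided(int *x, size_t rows, size_t cols)
{
    assert(rows >= 2 && rows % 2 == 0);
    for (size_t c = 0; c < cols; c++) {
        for (size_t r = 1; r < rows; r += 2) {
            int up = x[(r - 1) * cols + c];
            int down = (r + 1 < rows) ? x[(r + 1) * cols + c] : up;
            x[r * cols + c] -= floor_half(up + down);
        }
    }
}

/* Linear version: same arithmetic with the loops interchanged, so each
   odd row and its two neighbours are read left to right (stride 1). */
static void predict_cols_linear(int *x, size_t rows, size_t cols)
{
    assert(rows >= 2 && rows % 2 == 0);
    for (size_t r = 1; r < rows; r += 2) {
        const int *up = x + (r - 1) * cols;
        const int *down = (r + 1 < rows) ? x + (r + 1) * cols : up;
        int *cur = x + r * cols;
        for (size_t c = 0; c < cols; c++)
            cur[c] -= floor_half(up[c] + down[c]);
    }
}
```

Both functions produce identical coefficients; only the order in which memory is touched differs. On images much larger than the cache, the linear version typically reuses each fetched cache line fully, while the strided version touches a new line on almost every access.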
