Source: IEEE Signal Processing Magazine

Vectorized transforms in scalar processors


Abstract

We disclose a generalized approach to creating efficient implementations of linear, orthogonal transforms, with specific examples discussed for the 8 x 8 DCT used in image compression. We connect this with a method for performing signed, parallel processing in scalar, off-the-shelf processors for integer transforms. Uniform data precision may be used, but is not required for the method. The coefficients resulting from the new algorithm converge more quickly than the approximation made to the coefficients. Furthermore, the new algorithm allows more control of the specific representation chosen for the coefficients, as is detailed below. The methods described were designed for addressing this need with two's-complement arithmetic. Data that can be processed in parallel, because of the algorithm structure, are packed in a "vector" format, described, into registers. Many signed arithmetic operations can be performed on these vectors, including addition, subtraction, multiplication by scalars, shifting, and others. When the parallel processing is completed, the vectors can be unpacked into scalar values for storage or subsequent processing. The importance of these methods lies in their handling of carries and borrows in the packed vector format. The generalized method is described. Notation is given at the beginning to establish consistency through the article. We discuss a generalized approach to integer transforms, using the DCT as a specific example. Then we detail the vector format, which allows vector computation in scalar processors of parallelizable algorithms. The IDCT is used as a numerical example in the discussion of the vector format. The results were developed for high-end printers (e.g., more than 100 pages per minute), where image compression and decompression must be performed in real time, either in FPGAs, or in embedded processors; however, the methods are applicable to a broad range of signal processing systems.
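The abstract does not spell out the vector format, so the following is only a minimal sketch of one common way to pack signed values into a single scalar register so that borrows between fields are undone at unpack time. The field width (16 bits), the register width (64 bits), and the names `pack`, `unpack`, `vadd`, `vsub`, and `vscale` are all assumptions made for illustration and are not taken from the article. The sketch relies on packing being a plain two's-complement sum of shifted elements, which makes addition, subtraction, and multiplication by a scalar act on every field at once, provided each per-field result still fits in the signed field range.

```python
# Illustrative sketch only: layout and names are assumptions, not the article's format.
FIELD_BITS = 16                      # assumed width of each packed element
NUM_FIELDS = 4                       # assumed number of elements per register
REG_BITS = FIELD_BITS * NUM_FIELDS   # simulated 64-bit scalar register
REG_MASK = (1 << REG_BITS) - 1
FIELD_MASK = (1 << FIELD_BITS) - 1
HALF = 1 << (FIELD_BITS - 1)


def pack(values):
    """Pack signed elements as a two's-complement sum of shifted values.

    A negative element borrows from the field above it; unpack() repays
    that borrow, so no guard bits are needed in this sketch.
    """
    word = 0
    for i, v in enumerate(values):
        word += v << (i * FIELD_BITS)
    return word & REG_MASK


def unpack(word):
    """Recover the signed elements, lowest field first."""
    if word >> (REG_BITS - 1):       # reinterpret the register as signed
        word -= 1 << REG_BITS
    out = []
    for _ in range(NUM_FIELDS):
        # Sign-extend the low field, then remove it (and its borrow) from the word.
        field = ((word + HALF) & FIELD_MASK) - HALF
        out.append(field)
        word = (word - field) >> FIELD_BITS
    return out


def vadd(a, b):
    """Element-wise addition with one scalar add."""
    return (a + b) & REG_MASK


def vsub(a, b):
    """Element-wise subtraction with one scalar subtract."""
    return (a - b) & REG_MASK


def vscale(a, c):
    """Multiply every element by the scalar c with one scalar multiply."""
    return (a * c) & REG_MASK


if __name__ == "__main__":
    a = pack([3, -5, 7, -2])
    b = pack([-4, 6, -8, 1])
    print(unpack(vadd(a, b)))    # [-1, 1, -1, -1]
    print(unpack(vsub(a, b)))    # [7, -11, 15, -3]
    print(unpack(vscale(a, 3)))  # [9, -15, 21, -6]
```

The add and subtract pair is the butterfly step that dominates fast DCT and IDCT flow graphs, which is why a format of this kind suits those transforms. Results are only valid while every field stays within the signed 16-bit range assumed here; a real implementation would choose field widths from the precision each transform stage needs.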
