IEEE Transactions on Parallel and Distributed Systems

A fast, efficient parallel-acting method of generating functions defined by power series, including logarithm, exponential, and sine, cosine

Abstract

Trees and arrays are a fundamental means of implementing certain algorithms in parallel. A method is described for generating any function defined by a power series in a fast, efficient, parallel-acting manner using trees and arrays. The power series considered can be written as $f(Y) = a_0 + a_1 Y + a_2 Y^2 + \cdots$, where $Y = v_1 x + v_2 x^2 + \cdots + v_k x^k$, with $v_i \in \{0, 1\}$, is a binary fraction when $x = 1/2$. The power series must be expanded into individual terms $c x^{i}$. These terms are then transformed into weighted binary terms. Two methods are given to obtain all the individual terms (including coefficients) associated with each power of $x$. The hardware required for implementation is a tree similar to the Wallace or Dadda tree used for parallel multiplication of two binary numbers. Despite the multiplicity of terms required, Boolean logic methods reduce the tree dimensions in many cases, so that the total tree required is smaller than an existing multiplier tree. In that case, Schwarz and Flynn (1993) have shown that the required tree can be superimposed on the existing multiplier tree in a multiplexed manner with relatively little increase in hardware. The generation of the logarithmic function is described in detail. Comparisons with other methods are made for the case of 11-bit accuracy of the logarithm. Using a figure of merit of latency times area (number of transistors), estimates show that the superposition scheme gives the best (smallest) figure of merit. For 11-bit accuracy, the superposition scheme requires only about 480 additional gates to be superimposed upon a 41-bit or larger multiplier, and the speed of operation is that of the multiplier.
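The expansion step described in the abstract can be illustrated with a small sketch. It is not the authors' procedure or hardware mapping. It only shows, under stated assumptions, how a truncated power series in $Y$ becomes a sum of weighted products (ANDs) of the bits $v_i$, using the Boolean identity $v_i \cdot v_i = v_i$ to merge repeated bits. The choice of $\ln(1 + Y)$ truncated after $Y^4$, the width `K = 3`, and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def bit_poly(k):
    """Y = v_1/2 + ... + v_k/2**k as {frozenset of bit indices: coefficient}."""
    return {frozenset([i]): Fraction(1, 2 ** i) for i in range(1, k + 1)}


def multiply(p, q):
    """Multiply two polynomials in the bits, using v_i * v_i = v_i."""
    out = {}
    for s, a in p.items():
        for t, b in q.items():
            key = s | t  # union of index sets: a repeated bit collapses
            out[key] = out.get(key, Fraction(0)) + a * b
    return out


def expand_series(coeffs, k):
    """Expand sum_n coeffs[n] * Y**n into weighted AND-terms of v_1..v_k."""
    y = bit_poly(k)
    power = {frozenset(): Fraction(1)}  # Y**0
    total = {}
    for a_n in coeffs:
        for key, c in power.items():
            total[key] = total.get(key, Fraction(0)) + a_n * c
        power = multiply(power, y)
    return total


def evaluate(terms, bits):
    """Sum the weights of every AND-term whose bits are all 1."""
    on = {i for i, v in enumerate(bits, start=1) if v}
    return sum((c for key, c in terms.items() if key <= on), Fraction(0))


# Assumed example: ln(1 + Y) truncated to Y - Y^2/2 + Y^3/3 - Y^4/4.
LOG_COEFFS = [Fraction(0), Fraction(1), Fraction(-1, 2),
              Fraction(1, 3), Fraction(-1, 4)]
K = 3  # assumed number of fraction bits
terms = expand_series(LOG_COEFFS, K)

if __name__ == "__main__":
    for key, weight in sorted(terms.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(key), weight)
```

Because repeated bits collapse, the number of distinct AND-terms is bounded by the number of subsets of the $k$ bits, regardless of how many powers of $Y$ are kept. Each weight would still have to be rewritten as a sum of powers of $1/2$ before it could feed an adder tree; that step is not shown here.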
