The accumulation operation A_(new) = A_(old) + X is required for many numerical methods. However, when using a floating-point adder with pipeline latency α, the data hazard between A_(new) and A_(old) creates design challenges whenever inputs must be delivered to the accumulator at a rate exceeding 1/α. Each of the techniques proposed to address this problem requires either static data scheduling or an overly complex micro-architecture with multiple adders, a large amount of memory, or control overheads that force the accumulator to operate at a diminished speed relative to the adder on which it is based. In this paper, we present a design for a double-precision accumulator that achieves high performance without the need for data scheduling or an overly complex implementation. We achieve this by integrating a coalescing reduction circuit within the low-level design of a base-converting floating-point adder. When implemented on our Virtex-2 Pro 100 FPGA, our design achieves a speed of 170 MHz.
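To make the data hazard concrete, the following is a minimal software sketch of the problem setting, not of the design presented in the paper. It assumes a hypothetical adder with a latency of `alpha` cycles that receives one input per cycle. Because a sum issued in one cycle is not available until `alpha` cycles later, the model keeps `alpha` independent partial sums, one per pipeline slot, and these still have to be reduced to a single value at the end. The function names and the round-robin slot assignment are illustrative assumptions.

```python
def interleaved_partial_sums(xs, alpha):
    """Model accumulation through an adder with `alpha` cycles of latency.

    One input arrives per cycle. The sum issued in cycle c is not available
    until cycle c + alpha, so each input is added to the partial sum that
    was issued alpha cycles earlier (hypothetical round-robin scheme).
    """
    partial = [0.0] * alpha           # one running sum per pipeline slot
    for cycle, x in enumerate(xs):
        slot = cycle % alpha          # the only slot whose result is ready
        partial[slot] += x
    return partial


def reduce_partials(partial):
    """Collapse the per-slot partial sums into a single accumulated value."""
    total = 0.0
    for p in partial:
        total += p
    return total


if __name__ == "__main__":
    inputs = [float(i) for i in range(1, 11)]   # 1.0 through 10.0
    partials = interleaved_partial_sums(inputs, alpha=4)
    print(partials, reduce_partials(partials))
```

The final reduction step is where the difficulty lies in hardware: it needs either additional adders, buffering, or scheduling of the adder's idle slots, which is the kind of overhead the abstract says the proposed accumulator is designed to avoid.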