SPIE Conference on Visual Information Processing and Communication

A Sliced Synchronous Iteration Architecture for Real-Time Global Stereo Matching

Abstract

In this paper, we present a low memory-cost message iteration architecture for a fast belief propagation (BP) algorithm. To meet the real-time goal, our architecture follows the multi-scale BP method and the truncated linear smoothness cost model. We observe that the message iteration process in BP requires a huge intermediate buffer to store the four directional messages of all nodes. Therefore, instead of updating all node messages in each iteration sequence, we propose that each individual node complete its iteration process ahead of time, and that this be executed consecutively, node by node. The key ideas in this paper focus on both maximizing the architecture's parallelism and minimizing the implementation cost overhead. To this end, we first apply a pipelined architecture to each iteration stage, each of which is executed independently. Note that pipelining speeds up message throughput to a single iteration cycle, rather than consuming the whole iteration cycle time as before. We also group multiple message-update nodes into a minimal processing unit to maximize parallelism. For the multi-scale BP method, the proposed parallel architecture incurs no additional execution time for processing the nodes in the down-scaled Markov Random Field (MRF). For a VGA image size, 4 iterations per scale, and 64 disparity levels, our approach can reduce memory complexity by 99.7% and run 340 times faster than the general multi-scale BP architecture.
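For readers unfamiliar with the truncated linear smoothness model named in the abstract, here is a minimal Python sketch of a single min-sum BP message update under that model. It uses the well-known two-pass (forward and backward) lower-envelope method, which is an assumption about the software formulation and not the paper's hardware datapath. The function names, the `slope` and `trunc` parameters, and the example costs are all hypothetical. The helper `full_buffer_entries` only counts how many message values a conventional design would buffer if it stored four directional messages for every node at full resolution. It does not reproduce the 99.7% or 340x figures.

```python
from typing import List


def truncated_linear_message(h: List[float], slope: float, trunc: float) -> List[float]:
    """Compute m[d] = min over q of (h[q] + min(slope * |d - q|, trunc)).

    h holds, per disparity level, the data cost plus the incoming messages
    from the other three neighbours (assumed to be summed by the caller).
    Runs in O(L) for L disparity levels instead of the naive O(L^2).
    """
    m = list(h)
    # Forward pass: propagate the linear penalty from lower disparities.
    for d in range(1, len(m)):
        m[d] = min(m[d], m[d - 1] + slope)
    # Backward pass: propagate the linear penalty from higher disparities.
    for d in range(len(m) - 2, -1, -1):
        m[d] = min(m[d], m[d + 1] + slope)
    # Truncation: no entry may exceed the best cost plus the truncation value.
    cap = min(h) + trunc
    return [min(v, cap) for v in m]


def brute_force_message(h: List[float], slope: float, trunc: float) -> List[float]:
    """Reference O(L^2) evaluation of the same definition, for checking."""
    levels = range(len(h))
    return [min(h[q] + min(slope * abs(d - q), trunc) for q in levels) for d in levels]


def full_buffer_entries(width: int, height: int, levels: int, directions: int = 4) -> int:
    """Message values held if every node stores all directional messages."""
    return width * height * directions * levels


if __name__ == "__main__":
    costs = [5.0, 0.0, 7.0, 9.0, 3.0]  # hypothetical per-disparity costs
    print(truncated_linear_message(costs, slope=1.0, trunc=2.0))
    print(full_buffer_entries(640, 480, 64))  # VGA, 64 disparity levels
```

With VGA resolution and 64 disparity levels, the full-resolution buffer in this count holds 640 × 480 × 4 × 64 = 78,643,200 message values per iteration, which illustrates why the abstract targets the intermediate buffer. The byte width of each value is not stated in the abstract, so no size in bytes is given here.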