Journal: IEEE Transactions on Circuits and Systems I: Regular Papers

A Fast and Power-Efficient Hardware Architecture for Visual Feature Detection in Affine-SIFT



Abstract

Visual feature detection is widely used in computer vision applications, with growing emphasis on feature robustness, processing speed, and power efficiency. Compared with other popular feature detection algorithms, affine-SIFT achieves the strongest robustness to changes in image illumination, rotation, and scale, but has extremely high computational complexity. To improve its computing efficiency, this work first proposes three hardware optimization methods that address the three main performance bottlenecks. The first is reverse-affine-based pipelined computing with optimized memory access. The second is stream processing with a fully parallel Gaussian pyramid. The third is feature vector generation based on a rotation-invariant binary pattern. Combining these three methods, the paper then designs a highly efficient pipelined and parallel hardware architecture with optimized parallel memory access. Post-layout simulations using the TSMC 65-nm 1P9M low-power process show that the design achieves a processing speed of 97 fps at 1080p (1000 feature points per frame on average) at 200 MHz, with a power consumption of 300 mW. Its computing efficiency (1005.6K pixels/s per MHz) and power efficiency (670.5K pixels/s per mW) are higher than those of state-of-the-art works, making it promising for a broad range of vision applications, especially embedded and mobile vision.
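The two efficiency figures in the abstract appear to follow directly from the reported frame rate, clock frequency, and power. The short Python sketch below reproduces that arithmetic. It assumes that "1080p" means a 1920 x 1080 frame and that the efficiencies are pixel throughput divided by clock frequency and by power; the abstract does not spell out either definition, and the variable and function names are illustrative only.

```python
# Reproduce the efficiency figures quoted in the abstract.
# Assumption: "1080p" means a 1920 x 1080 pixel frame.
WIDTH, HEIGHT = 1920, 1080
FPS = 97          # reported frame rate
CLOCK_MHZ = 200   # reported clock frequency
POWER_MW = 300    # reported power consumption


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Raw pixel throughput of the detector."""
    return width * height * fps


def per_unit(pixel_rate: float, resource: float) -> float:
    """Throughput normalized by a resource (MHz of clock or mW of power)."""
    return pixel_rate / resource


rate = pixels_per_second(WIDTH, HEIGHT, FPS)
computing_eff = per_unit(rate, CLOCK_MHZ)  # pixels/s per MHz
power_eff = per_unit(rate, POWER_MW)       # pixels/s per mW

print(f"throughput:           {rate / 1e6:.1f} Mpixels/s")
print(f"computing efficiency: {computing_eff / 1e3:.1f}K pixels/s per MHz")
print(f"power efficiency:     {power_eff / 1e3:.1f}K pixels/s per mW")
```

Under these assumptions the throughput is 201,139,200 pixels/s, which gives about 1005.7K pixels/s per MHz and 670.5K pixels/s per mW. These agree with the quoted 1005.6K and 670.5K to within rounding.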
