Journal: IEICE Transactions on Information and Systems

Fast AdaBoost-Based Face Detection System on a Dynamically Coarse Grain Reconfigurable Architecture

Abstract

An AdaBoost-based face detection system is proposed on a Coarse Grain Reconfigurable Architecture (CGRA) named “REMUS-II”. Our work differs from previous work in three respects. First, a new hardware-software partition method is proposed: the whole face detection system is divided into several parallel tasks, which are implemented on two Reconfigurable Processing Units (RPUs) and one Micro Processor Unit (μPU) according to their relationships. These tasks communicate with each other through a mailbox mechanism. Second, a strong classifier is treated as the smallest phase of the detection system, and each phase is executed by these tasks in order. A Haar classifier phase is dynamically mapped onto a Reconfigurable Cell Array (RCA) only when needed, unlike traditional Field Programmable Gate Array (FPGA) methods, in which all the classifiers are implemented statically. Third, optimized data and configuration-word pre-fetch mechanisms are employed to improve overall system performance. Implementation results show that our approach, at a 200 MHz clock rate, can process up to 17 frames per second on VGA-size images, with a detection rate of over 95%. Our system consumes 194 mW, and the die size of the fabricated chip is 23 mm² using TSMC 65 nm standard-cell technology. To the best of our knowledge, this work is the first implementation of the cascade Haar classifier algorithm on a dynamically reconfigurable CGRA platform presented in the literature.
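The phase-by-phase cascade described in the abstract can be illustrated with a small software sketch. This is not the REMUS-II implementation, only a hedged model of the general idea: each strong classifier is one phase, a phase is loaded only when a window actually reaches it, and a window is rejected as soon as one phase fails. The names `Stage`, `run_cascade`, and `load_stage`, the representation of a window as a list of precomputed feature values, and the toy thresholds are all assumptions made for illustration. The helper `cycles_per_pixel` derives a rough cycle budget from the reported figures, assuming VGA means 640x480 pixels.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# A weak classifier is modelled (hypothetically) as
# (feature_index, feature_threshold, vote_weight).
Weak = Tuple[int, float, float]


@dataclass
class Stage:
    """One strong classifier, i.e. one phase of the cascade."""
    weak: Sequence[Weak]
    threshold: float


def run_cascade(
    window: Sequence[float],
    stages: Sequence[Stage],
    load_stage: Callable[[int, Stage], Stage] = lambda i, s: s,
) -> Tuple[bool, int]:
    """Evaluate phases in order; return (is_face, phases_evaluated).

    load_stage stands in for mapping a phase onto the hardware on demand.
    It is only called for phases the window actually reaches.
    """
    for i, stage in enumerate(stages):
        active = load_stage(i, stage)
        # Weighted vote of the weak classifiers in this phase.
        score = sum(
            alpha for (idx, thr, alpha) in active.weak if window[idx] >= thr
        )
        if score < active.threshold:
            return False, i + 1  # early rejection, later phases never load
    return True, len(stages)


def cycles_per_pixel(clock_hz: float, fps: float, width: int, height: int) -> float:
    """Average clock cycles available per input pixel at a given frame rate."""
    return clock_hz / (fps * width * height)


# Toy two-phase cascade (values are illustrative, not from the paper).
stages: List[Stage] = [
    Stage(weak=[(0, 0.5, 1.0)], threshold=1.0),
    Stage(weak=[(1, 0.5, 0.6), (2, 0.5, 0.6)], threshold=1.0),
]

loaded: List[int] = []


def tracking_loader(i: int, stage: Stage) -> Stage:
    loaded.append(i)  # record which phases were actually "mapped"
    return stage


rejected = run_cascade([0.1, 0.9, 0.9], stages, tracking_loader)
accepted = run_cascade([0.9, 0.9, 0.9], stages)
budget = cycles_per_pixel(200e6, 17, 640, 480)
```

In this sketch the rejected window touches only the first phase, which is the property that makes on-demand mapping attractive: most windows in an image are not faces and never require the later phases. The cycle budget works out to roughly 38 cycles per pixel on average, a derived figure rather than one stated in the abstract.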
