Source: IEEE Transactions on Very Large Scale Integration (VLSI) Systems

REC-STA: Reconfigurable and Efficient Chip Design With SMO-Based Training Accelerator

Abstract

Sequential minimal optimization (SMO) and the Karush–Kuhn–Tucker (KKT) conditions are often used to solve learning problems in support vector machines. However, when the SMO algorithm is implemented in hardware, enhancing chip performance without excessively increasing chip area is often a crucial issue. The solution proposed in this paper is a novel reconfigurable and efficient chip design with an SMO-based training accelerator (REC-STA). Two novel methods are used in the proposed REC-STA: a trimode coarse-grained reconfigurable architecture (TCRA) and a triple finite-state machine with dynamic scheduling. The first method modifies the baseline SMO design by developing trimode reconfigurable architectures with parallel and pipeline computing capabilities. The second method provides a schedule for efficient reconfiguration of the TCRA. Together, these methods make it possible to remove the kernel cache from the design. For chip manufacturing, the implementation of the REC-STA is synthesized, placed, and routed using the TSMC 0.18-$\mu\mathrm{m}$ technology library. The core size is 2.94 mm $\times$ 2.94 mm and the power consumption is 77.3 mW. Compared with the baseline design, the FPGA simulation results show that the proposed architecture requires 50% less memory and 31% fewer gates while providing a 16-fold improvement in training performance. The experimental results confirm the efficacy of the proposed architecture and methods.
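
For readers unfamiliar with SMO, the following is a minimal software sketch of the textbook algorithm the abstract refers to. It is not the REC-STA hardware design or its baseline. It assumes a linear kernel, a simplified working-set rule (the second index is the one with the largest error difference), and recomputes kernel values on demand purely for brevity. The names `smo_train`, `predict`, and `linear_kernel` and all parameter defaults are hypothetical and chosen for illustration.

```python
def linear_kernel(a, b):
    """Dot product of two feature vectors (assumed kernel for this sketch)."""
    return sum(x * y for x, y in zip(a, b))


def smo_train(X, y, C=1.0, tol=1e-3, max_sweeps=100, kernel=linear_kernel):
    """Simplified SMO: repeatedly fix one KKT-violating pair of multipliers."""
    n = len(X)
    alpha = [0.0] * n
    b = 0.0

    def decision(i):
        # SVM output for training sample i under the current alpha and b.
        return sum(alpha[k] * y[k] * kernel(X[k], X[i]) for k in range(n)) + b

    for _ in range(max_sweeps):
        changed = 0
        for i in range(n):
            errors = [decision(k) - y[k] for k in range(n)]
            Ei = errors[i]
            r = y[i] * Ei
            # KKT check: skip samples that already satisfy the conditions.
            if not ((r < -tol and alpha[i] < C) or (r > tol and alpha[i] > 0)):
                continue
            # Second index: largest |Ei - Ej| (a simplified heuristic).
            j = max((k for k in range(n) if k != i),
                    key=lambda k: abs(Ei - errors[k]))
            Ej = errors[j]
            ai_old, aj_old = alpha[i], alpha[j]
            # Box bounds that keep sum(alpha * y) == 0 and 0 <= alpha <= C.
            if y[i] != y[j]:
                lo, hi = max(0.0, aj_old - ai_old), min(C, C + aj_old - ai_old)
            else:
                lo, hi = max(0.0, ai_old + aj_old - C), min(C, ai_old + aj_old)
            if lo == hi:
                continue
            kii, kjj, kij = kernel(X[i], X[i]), kernel(X[j], X[j]), kernel(X[i], X[j])
            eta = 2.0 * kij - kii - kjj
            if eta >= 0:
                continue
            # Analytic optimum for alpha[j], clipped to the box.
            aj = min(hi, max(lo, aj_old - y[j] * (Ei - Ej) / eta))
            if abs(aj - aj_old) < 1e-5:
                continue
            ai = ai_old + y[i] * y[j] * (aj_old - aj)
            alpha[i], alpha[j] = ai, aj
            # Update the bias so the KKT conditions hold for the changed pair.
            b1 = b - Ei - y[i] * (ai - ai_old) * kii - y[j] * (aj - aj_old) * kij
            b2 = b - Ej - y[i] * (ai - ai_old) * kij - y[j] * (aj - aj_old) * kjj
            if 0 < ai < C:
                b = b1
            elif 0 < aj < C:
                b = b2
            else:
                b = (b1 + b2) / 2.0
            changed += 1
        if changed == 0:  # a full sweep found no pair to improve
            break
    return alpha, b


def predict(X, y, alpha, b, x, kernel=linear_kernel):
    """Classify x as +1 or -1 using the trained multipliers."""
    score = sum(a * t * kernel(xi, x) for a, t, xi in zip(alpha, y, X)) + b
    return 1 if score >= 0 else -1
```

Each inner step evaluates kernel products and error terms for every training sample. That repeated work is what makes kernel evaluation a natural target for parallel and pipelined hardware of the kind the abstract describes.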