IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A Generic and Scalable Architecture for a Large Acoustic Model and Large Vocabulary Speech Recognition Accelerator Using Logic on Memory


Abstract

This paper describes a scalable hardware accelerator for speech recognition, which uses a two-pass decoding algorithm with word-dependent N-best Viterbi beam search. The observation probability calculation (senone scoring) and the first decoding pass, which uses a bigram language model, are implemented in hardware. The word lattice output from the first pass is used by software for the second pass, which applies a trigram language model. The proposed design uses a logic-on-memory approach to exploit high-bandwidth NOR flash memory, improving random read performance for senone scoring and first-pass decoding, both of which are memory-intensive operations. The proposed HW/SW co-design achieves an overall speedup of 4.3X over a 2.4-GHz Intel Core 2 Duo processor running the CMU Sphinx speech recognition software, while consuming an estimated 1.72 W of power. The hardware accelerator improves speech recognition accuracy by supporting larger acoustic models and word dictionaries while maintaining real-time performance.
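To make the first-pass algorithm the abstract refers to concrete, the following is a minimal, self-contained sketch of beam-pruned decoding with a bigram language model. It is purely illustrative and is not the paper's hardware design: the vocabulary, bigram probabilities, the `acoustic_score` stand-in for senone scoring, and the beam width are all assumed toy values.

```python
import math

# Assumed toy bigram log-probabilities (illustration only, not from the paper).
BIGRAM = {
    ("<s>", "the"): math.log(0.6),
    ("<s>", "a"): math.log(0.4),
    ("the", "cat"): math.log(0.5),
    ("the", "dog"): math.log(0.5),
    ("a", "cat"): math.log(0.7),
    ("a", "dog"): math.log(0.3),
}
VOCAB = ["the", "a", "cat", "dog"]

def acoustic_score(word, t):
    # Stand-in for senone scoring: a fixed toy log-likelihood per frame.
    TABLE = [
        {"the": -1.0, "a": -2.0, "cat": -9.0, "dog": -9.0},
        {"the": -9.0, "a": -9.0, "cat": -1.5, "dog": -2.5},
    ]
    return TABLE[t][word]

def beam_decode(num_frames, beam_width=2):
    # Each hypothesis is (total log score, word sequence), starting at <s>.
    beam = [(0.0, ("<s>",))]
    for t in range(num_frames):
        candidates = []
        for score, seq in beam:
            for w in VOCAB:
                lm = BIGRAM.get((seq[-1], w))
                if lm is None:
                    continue  # skip bigrams absent from the model
                candidates.append((score + lm + acoustic_score(w, t),
                                   seq + (w,)))
        # Beam pruning: keep only the top-scoring hypotheses (the N-best).
        candidates.sort(key=lambda h: h[0], reverse=True)
        beam = candidates[:beam_width]
    return beam

if __name__ == "__main__":
    best_score, best_seq = beam_decode(num_frames=2)[0]
    print(best_seq[1:])  # best word sequence without the start symbol
```

In the accelerator described above, the equivalent of `acoustic_score` and this first pass run in hardware against NOR flash, and the surviving hypotheses form the word lattice handed to the software trigram pass.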
