
Design Implementation of a Co-processor for Embedded, Real-Time, Speaker-Independent, Continuous Speech Recognition System-on-a-Chip



Abstract

This thesis aims to break the myth that multi-GHz machines are required to process speaker-independent, continuous speech recognition in real time when it is based on full models performing full-precision computations. Through the design of a custom hardware architecture, this research shows that 100 MHz is sufficient to process a 1,000-word dictionary in real time. The design and implementation of the architecture are discussed in this thesis. It is shown that this implementation requires limited hardware resources and can therefore be incorporated as a dedicated speech recognition co-processor.

The system comprises three major blocks corresponding to Acoustic, Phonetic and Word Modeling. For maximum performance, each block has been implemented in a highly pipelined manner, enabling several quantities to be computed simultaneously. Further, fewer computations imply lower power consumption. To achieve this, every stage of the computation has been optimized by incorporating feedback, so that only the data active at a given time instant is computed. To ensure a scalable implementation, a dynamic memory allocation scheme has also been incorporated, which helps manage the internal memory.

Amongst the three blocks, Acoustic Modeling contributes between 55% and 95% of the overall computations performed by the system. Special attention was therefore paid to the computations in Acoustic Modeling, and a new computation reduction technique, bestN, is proposed. This technique addresses both the bandwidth requirement and the complexity of the computations. It is shown that, for little loss in relative accuracy, only 8-bit integer micro-addition operations are required, whereas traditional systems need numerous 32-bit multiply and add operations. The technique also requires 1/8th the bandwidth of traditional methods, so that for the same bus bandwidth an 8x speedup in performance can be achieved.

Bibliographic details

  • Author: Gupta Kshitij
  • Year: 2006
  • Original format: PDF
  • Language: en
