International Symposium on Microarchitecture

UNFOLD: A Memory-Efficient Speech Recognizer Using On-The-Fly WFST Composition



Abstract

Accurate, real-time Automatic Speech Recognition (ASR) requires huge memory storage and computational power. The main bottleneck in state-of-the-art ASR systems is the Viterbi search on a Weighted Finite State Transducer (WFST). The WFST is a graph-based model created by composing an Acoustic Model (AM) and a Language Model (LM) offline. Offline composition simplifies the implementation of a speech recognizer, as only one WFST has to be searched. However, the size of the composed WFST is huge, typically larger than a Gigabyte, resulting in a large memory footprint and high memory bandwidth requirements. In this paper, we take a completely different approach and propose a hardware accelerator for speech recognition that composes the AM and LM graphs on-the-fly. In our ASR system, the fully-composed WFST is never generated in main memory. Instead, only the subset required for decoding each input speech fragment is dynamically generated from the AM and LM models. In addition to the direct benefits of this on-the-fly composition, the resulting approach is more amenable to further reduction in storage requirements through compression techniques. The resulting accelerator, called UNFOLD, performs the decoding in real-time using the compressed AM and LM models, and reduces the size of the datasets from more than one Gigabyte to less than 40 Megabytes, which can be very important in small form factor mobile and wearable devices. Besides, UNFOLD improves energy-efficiency by orders of magnitude with respect to CPUs and GPUs. Compared to state-of-the-art Viterbi search accelerators, the proposed ASR system provides a 31x reduction in memory footprint and 28% energy savings on average.
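The key idea in the abstract is that composed states can be generated lazily as (AM state, LM state) pairs during decoding, so the full composed WFST never exists in memory. The following is a minimal sketch of that idea with tiny hypothetical toy transducers (not the UNFOLD hardware design or its actual graph formats); the `AM` and `LM` tables, labels, and weights are illustrative only.

```python
# Lazy (on-the-fly) WFST composition sketch: a composed state is a pair
# (AM state, LM state), and its outgoing arcs are generated on demand
# rather than being precomputed in a fully-composed graph.

# Toy acoustic-model transducer:
#   state -> list of (input_label, output_word, weight, next_state)
AM = {
    0: [("a1", "A", 0.5, 1)],
    1: [("a2", "B", 0.3, 2)],
    2: [],
}

# Toy language-model transducer:
#   state -> {input_word: (weight, next_state)}
LM = {
    0: {"A": (0.2, 1)},
    1: {"B": (0.4, 2)},
    2: {},
}

def expand(state):
    """Lazily yield arcs leaving a composed (am, lm) state.

    An arc exists only where an AM arc's output word matches an arc
    accepted by the current LM state; weights are combined (added,
    as in the tropical semiring over negative log-probabilities).
    """
    am_s, lm_s = state
    for in_lab, word, am_w, am_next in AM[am_s]:
        if word in LM[lm_s]:
            lm_w, lm_next = LM[lm_s][word]
            yield in_lab, word, am_w + lm_w, (am_next, lm_next)

def decode(start=(0, 0)):
    """Follow the single path in this toy composition, accumulating
    the output words and the total path weight."""
    words, total, state = [], 0.0, start
    while True:
        arcs = list(expand(state))
        if not arcs:
            return words, total
        _, word, w, state = arcs[0]
        words.append(word)
        total += w
```

Note that only the composed states actually reached during decoding ever exist, which is the memory saving the paper exploits; a real decoder would run a beam-pruned Viterbi search over `expand` rather than follow a single path.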
