Harnessing graphics processors for the fast computation of acoustic likelihoods in speech recognition

Paul R. Dixon; Tasuku Oonishi; Sadaoki Furui

Abstract

In large vocabulary continuous speech recognition (LVCSR), the acoustic model computations often account for the largest share of processing time. Our weighted finite state transducer (WFST) based decoding engine can use a commodity graphics processing unit (GPU) to perform the acoustic computations, moving this burden off the main processor. In this paper we describe our new GPU scheme, which achieves a substantial improvement in recognition speed while incurring no reduction in recognition accuracy. We evaluate the GPU technique on a large vocabulary spontaneous speech recognition task using a set of acoustic models of varying complexity. The results consistently show that using the GPU reduces recognition time, with the largest improvements occurring in systems with large numbers of Gaussians. For the systems that achieve the best accuracy, we obtained speed-ups of between 2.5 and 3 times. The faster decoding times translate into reductions in space, power and hardware costs, while requiring only standard hardware that is already widely installed.
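The abstract does not give implementation details, so the following is only a minimal sketch of the kind of computation being offloaded, not the authors' GPU scheme. It assumes diagonal-covariance Gaussian mixtures, and the names `diag_gaussian_logpdf`, `gmm_loglik`, `score_frame` and the toy parameters are hypothetical. It shows why cost grows with the number of Gaussians and why the work parallelizes well: every state's score for a frame is independent of the others.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    d = len(x)
    log_det = sum(math.log(v) for v in var)
    mahal = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + mahal)

def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of x under a Gaussian mixture, using log-sum-exp."""
    scores = [math.log(w) + diag_gaussian_logpdf(x, m, v)
              for w, m, v in zip(weights, means, variances)]
    top = max(scores)  # subtract the maximum for numerical stability
    return top + math.log(sum(math.exp(s - top) for s in scores))

def score_frame(x, states):
    """Score one frame against every state.

    Each list entry is computed independently of the others; this is the
    data-parallel structure that a GPU can exploit (assumption: one
    mixture per state).
    """
    return [gmm_loglik(x, *state) for state in states]

# Toy, made-up parameters: two states over 2-dimensional features.
states = [
    ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    ([1.0], [[3.0, 3.0]], [[1.0, 1.0]]),
]
frame = [0.0, 0.0]
scores = score_frame(frame, states)
```

In this sequential form the cost per frame is proportional to the total number of Gaussians times the feature dimension, which is consistent with the abstract's observation that the largest gains occur in systems with many Gaussians.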

Bibliographic details

Source
Computer Speech and Language | 2009, Issue 4 | pp. 510-526 | 17 pages
Authors
Paul R. Dixon; Tasuku Oonishi; Sadaoki Furui;
Author affiliations

Department of Computer Science, Tokyo Institute of Technology, 2-12-1, Ookayama, Meguro-ku, Tokyo 152-8552, Japan;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
LVCSR; GPGPU; novel hardware for ASR; WFST;


