Cascade RNN-Transducer: Syllable Based Streaming On-Device Mandarin Speech Recognition with a Syllable-To-Character Converter

Abstract

End-to-end models are favored in automatic speech recognition (ASR) because of their simplified system structure and superior performance. Among these models, the recurrent neural network transducer (RNN-T) has achieved significant progress in streaming on-device speech recognition because of its high accuracy and low latency. RNN-T uses a prediction network to model language information, but its language modeling ability is limited because it still needs paired speech-text data for training. Strengthening the language modeling ability further with extra text data, for example through shallow fusion with an external language model, brings only a small performance gain. Because Mandarin Chinese is a character-based language and each character is pronounced as a tonal syllable, this paper proposes a novel cascade RNN-T approach to improve the language modeling ability of RNN-T. The approach first uses an RNN-T to transform acoustic features into a syllable sequence, and then converts the syllable sequence into a character sequence through an RNN-T-based syllable-to-character converter. A rich text repository can thus be used easily to strengthen the language modeling ability. By introducing several important tricks, the cascade RNN-T approach surpasses the character-based RNN-T by a large margin on several Mandarin test sets, with much higher recognition quality and similar latency.
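The two-stage data flow described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: both stages in the paper are RNN-T models, whereas here the first stage is a pass-through stub and the second stage is a small Viterbi search over homophone candidates. The lexicon `SYLLABLE_TO_CHARS`, the scores in `BIGRAM_SCORE`, and all function names are hypothetical and exist only to show why a syllable-to-character step can benefit from text-only knowledge.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: tonal syllable -> candidate characters (homophones).
SYLLABLE_TO_CHARS: Dict[str, List[str]] = {
    "ni3": ["你", "拟"],
    "hao3": ["好", "郝"],
}

# Hypothetical bigram scores, standing in for knowledge learned from text-only data.
BIGRAM_SCORE: Dict[Tuple[str, str], float] = {
    ("你", "好"): 2.0,
}


def acoustic_to_syllables(features: List[str]) -> List[str]:
    """Stage 1 stub: the toy 'features' are already tonal syllable labels."""
    return list(features)


def syllables_to_characters(syllables: List[str]) -> str:
    """Stage 2 stub: pick the highest-scoring character sequence (Viterbi)."""
    # best maps the last character of a path to (score, text) of the best such path.
    best: Dict[str, Tuple[float, str]] = {"": (0.0, "")}
    for syllable in syllables:
        step: Dict[str, Tuple[float, str]] = {}
        for candidate in SYLLABLE_TO_CHARS[syllable]:
            step[candidate] = max(
                (score + BIGRAM_SCORE.get((prev, candidate), 0.0), text + candidate)
                for prev, (score, text) in best.items()
            )
        best = step
    return max(best.values())[1]


def cascade_recognize(features: List[str]) -> str:
    """Chain the two stages: features -> syllables -> characters."""
    return syllables_to_characters(acoustic_to_syllables(features))


if __name__ == "__main__":
    print(cascade_recognize(["ni3", "hao3"]))
```

In this toy setup the second stage never sees audio, so its scores could be estimated from text alone, which mirrors the motivation the abstract gives for separating the syllable and character stages.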

Bibliographic record

Source
Spoken Language Technology Workshop | 2021 | pp. 15-21 | 7 pages
Authors
Xiong Wang; Zhuoyuan Yao; Xian Shi; Lei Xie
Original format: PDF
Keywords
Transducers; Recurrent neural networks; Transforms; Predictive models; Performance gain; Data models; Acoustics
