Research on Mongolian Speech Recognition Based on FSMN

Abstract

Deep neural network (DNN) models have achieved significant results on the Mongolian speech recognition task; however, compared with Chinese, English, and other languages, there is still room for improvement. This paper presents the first application of the Feed-forward Sequential Memory Network (FSMN) to Mongolian speech recognition, modeling long-term dependencies in time series without recurrent feedback. Furthermore, to model the speaker in the feature space, we extract i-vector features and combine them with Fbank features as the input, validating their effectiveness in Mongolian ASR. Finally, discriminative training is conducted on the FSMN for the first time, using maximum mutual information (MMI) and state-level minimum Bayes risk (sMBR), respectively. The experimental results show that FSMN outperforms DNN in Mongolian ASR, and that using i-vector features combined with Fbank features as FSMN input, together with discriminative training, yields a 17.9% relative reduction in word error rate (WER) compared with the DNN baseline.
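
The three ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the array shapes, and the choice of a unidirectional, vectorized memory block (one common FSMN variant) are assumptions made here for illustration, and the record does not state the paper's actual configuration.

```python
import numpy as np


def append_ivector(fbank, ivector):
    """Concatenate one utterance-level i-vector onto every Fbank frame.

    fbank:   (T, F) array of per-frame filterbank features.
    ivector: (K,) array, a single speaker vector for the whole utterance.
    Returns a (T, F + K) array. Shapes are assumptions of this sketch.
    """
    tiled = np.tile(ivector, (fbank.shape[0], 1))
    return np.concatenate([fbank, tiled], axis=1)


def fsmn_memory(hidden, taps):
    """Unidirectional, vectorized FSMN-style memory block (assumed variant).

    hidden: (T, D) hidden activations of one layer.
    taps:   (N + 1, D) learnable coefficients, one vector per look-back step.
    Computes out[t] = sum_i taps[i] * hidden[t - i], treating frames before
    the start of the utterance as zero. No recurrent feedback is involved:
    the output depends only on a finite window of past hidden activations.
    """
    num_frames = hidden.shape[0]
    out = np.zeros(hidden.shape, dtype=float)
    for i, coeff in enumerate(taps):
        if i >= num_frames:
            break  # the look-back is longer than the utterance
        out[i:] += coeff * hidden[:num_frames - i]
    return out


def relative_wer_reduction(baseline_wer, new_wer):
    """Relative reduction, as a fraction of the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer
```

As a usage note, `relative_wer_reduction(20.0, 15.0)` evaluates to 0.25, that is, a 25% relative reduction. Both input values are made up for this example and are not results from the paper; the abstract reports only the 17.9% relative figure.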


Bibliographic details

Source
International Conference on Natural Language Processing and Chinese Computing | 2017 | 966p | 12 pages
Authors
Yonghe Wang; Feilong Bao; Hongwei Zhang; Guanglai Gao
Original format PDF
Chinese Library Classification TP312-53
Keywords
Mongolian; Speech recognition; DNN; FSMN; I-vector; Sequence-criterion training

Similar literature


1. Combination of GMM-Based Speech Estimation Method and Temporal Domain SVD-Based Speech Enhancement for Noise Robust Speech Recognition [J]. Masakiyo Fujimoto, Yasuo Ariki. Systems and Computers in Japan. 2007, No. 3
2. English Phrase Speech Recognition Based on Continuous Speech Recognition Algorithm and Word Tree Constraints [J]. Haifan Du, Haiwen Duan. Complexity. 2021, No. a
3. Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds [J]. Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani. Computer Speech and Language. 2013, No. 3
4. Research on Mongolian Speech Recognition Based on FSMN [C]. Yonghe Wang, Feilong Bao, Hongwei Zhang. Natural language understanding and intelligent applications. 2017
5. Robust speech processing based on microphone array, audio-visual, and frame selection for in-vehicle speech recognition and in-set speaker recognition [D]. Zhang, Xianxian. 2005
6. A Speech Recognition-based Solution for the Automatic Detection of Mild Cognitive Impairment from Spontaneous Speech [O]. László Tóth, Ildikó Hoffmann, Gábor Gosztolya
7. The Performance Evaluation of Continuous Speech Recognition Based on Korean Phonological Rules of Cloud-Based Speech Recognition Open API [O]. Hyun Jae Yoo, Sungwoong Seo, Sun Woo Im. 2021
8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015
