Investigating Manifold Learning Technique for Robust Speech Recognition


Abstract

Developing robustness methods is imperative for maintaining good performance in automatic speech recognition (ASR) systems confronted with varying environmental noise or channel distortion. Previous studies have pointed out that exploring the low-dimensional structures of speech features helps generate robust features and thereby enhances ASR performance. Along this research direction, we argue that the intrinsic structures of speech features lie on a low-dimensional manifold subspace residing in their original high-dimensional ambient space. Noise components can therefore be ruled out by projecting noisy speech features into the pre-learned subspace of manifold structures. This paper explores the intrinsic geometric low-dimensional manifold structures inherent in the modulation spectra of speech features, with the goal of generating speech features that are more robust to environmental noise and channel distortion. The key novelty of our work is two-fold: 1) we put forward an innovative use of a graph-regularization-based method to generate robust speech features by preserving the inherent manifold structures of modulation spectra and excluding irrelevant ones, and 2) we compare our approach, with in-depth analysis, against several mainstream methods that also explore low-dimensional structures of data instances. A comprehensive set of empirical experiments carried out on an ASR benchmark task seems to reveal the superior performance of our proposed methods.
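The projection idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the graph-regularized method, so this example swaps in plain PCA to learn the subspace. The synthetic data, the dimensions, and the helper names `learn_subspace` and `project` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def learn_subspace(clean, k):
    """Return (mean, basis) for the top-k principal directions of `clean` (n_samples x dim)."""
    mean = clean.mean(axis=0)
    # Rows of vt are orthonormal principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(clean - mean, full_matrices=False)
    return mean, vt[:k]

def project(features, mean, basis):
    """Orthogonally project `features` onto the affine subspace mean + span(basis)."""
    return mean + (features - mean) @ basis.T @ basis

rng = np.random.default_rng(0)
dim, k, n = 20, 3, 500                          # assumed sizes, chosen for illustration only
mixing = rng.standard_normal((k, dim))          # hypothetical low-dimensional structure
clean = rng.standard_normal((n, k)) @ mixing    # synthetic "clean" features on a k-dim subspace
mean, basis = learn_subspace(clean, k)          # subspace learned in advance from clean data

# Held-out features from the same subspace, corrupted by additive noise.
test_clean = rng.standard_normal((100, k)) @ mixing
noisy = test_clean + 0.5 * rng.standard_normal(test_clean.shape)
denoised = project(noisy, mean, basis)

err_noisy = np.linalg.norm(noisy - test_clean)
err_denoised = np.linalg.norm(denoised - test_clean)
print(f"error before projection: {err_noisy:.3f}, after: {err_denoised:.3f}")
```

Because the projection is orthogonal, it keeps only the part of the noise that falls inside the learned subspace and discards the rest, which is why the error drops. A graph-regularized method would additionally try to preserve neighborhood relations among the training samples, which this linear sketch does not attempt.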


Bibliographic details

Source
International Conference on Asian Language Processing | 2018 | pp. 68-73 | 6 pages
Authors
Bi-Cheng Yan; Chin-Hong Shih; Berlin Chen; Shih-Hung Liu
Original format
PDF
Keywords
Modulation; Manifolds; Robustness; Noise measurement; Sparse matrices; Dictionaries; Matching pursuit algorithms

