Compensation for Nonlinear Distortion in Noise for Robust Speech Recognition.

Abstract

The performance, reliability, and ubiquity of automatic speech recognition systems have flourished in recent years due to steadily increasing computational power and technological innovations such as hidden Markov models, weighted finite-state transducers, and deep learning methods. One problem that plagues speech recognition systems, especially those that operate offline and have been trained on specific in-domain data, is the deleterious effect of noise on recognition accuracy. Historically, robust speech recognition research has focused on traditional noise types such as additive noise, linear filtering, and reverberation. This thesis describes the effects of nonlinear dynamic range compression on automatic speech recognition and develops a number of novel techniques for characterizing and counteracting it. Dynamic range compression is any function that reduces the dynamic range of an input signal. It is a widely used tool in audio engineering and is almost always a component of a practical telecommunications system. Despite its ubiquity, this thesis is the first work to comprehensively study and address the effect of dynamic range compression on speech recognition.

More specifically, this thesis treats the problem of dynamic range compression in three ways: (1) blind amplitude normalization methods, which counteract dynamic range compression when its parameter values allow the function to be mathematically inverted, (2) blind amplitude reconstruction techniques, i.e., declipping, which attempt to reconstruct clipped segments of the speech signal that are lost through non-invertible dynamic range compression, and (3) matched-training techniques, which attempt to select the pre-trained acoustic model with the closest set of compression parameters. All three of these methods rely on robust estimation of the dynamic range compression distortion parameters. Novel algorithms for the blind prediction of these parameters are also introduced.

The algorithms' quality is evaluated in terms of the degree to which they decrease speech recognition word error rate, as well as the degree to which they increase a given speech signal's signal-to-noise ratio. In all evaluations, the possibility of independent additive noise following the application of dynamic range compression is assumed.
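
The distinction the abstract draws between invertible and non-invertible dynamic range compression can be illustrated with a minimal Python sketch. The abstract does not give the thesis's functional form, so everything here is an assumption: the threshold/ratio parameterization is the static compressor convention common in audio engineering, and the names `compress`, `expand`, and `snr_db` are hypothetical. The parameters are taken as known, whereas the thesis estimates them blindly.

```python
import math

def compress(x, threshold, ratio):
    """Static compressor (assumed form): amplitude above `threshold`
    is divided by `ratio`. ratio = math.inf gives hard clipping."""
    mag = abs(x)
    if mag <= threshold:
        return x
    return math.copysign(threshold + (mag - threshold) / ratio, x)

def expand(y, threshold, ratio):
    """Inverse of `compress`; defined only for a finite ratio."""
    if math.isinf(ratio):
        raise ValueError("hard clipping discards amplitude and cannot be inverted")
    mag = abs(y)
    if mag <= threshold:
        return y
    return math.copysign(threshold + (mag - threshold) * ratio, y)

def snr_db(clean, distorted):
    """Signal-to-noise ratio in dB, treating (clean - distorted) as noise."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - d) ** 2 for c, d in zip(clean, distorted))
    return math.inf if noise_power == 0 else 10 * math.log10(signal_power / noise_power)

# Two periods of a sine wave stand in for a speech signal.
signal = [0.9 * math.sin(2 * math.pi * k / 32) for k in range(64)]
compressed = [compress(s, 0.5, 4.0) for s in signal]      # invertible case
restored = [expand(c, 0.5, 4.0) for c in compressed]      # amplitude normalization
clipped = [compress(s, 0.5, math.inf) for s in signal]    # non-invertible case
```

With a finite ratio, `expand` recovers the input up to floating-point error, which corresponds to the amplitude normalization case. With an infinite ratio every sample above the threshold maps to the same value, so the lost segments would have to be reconstructed by a separate declipping method, which this sketch does not attempt.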

Bibliographic record

Author: Harvilla, Mark J.
Author affiliation: Carnegie Mellon University
Degree-granting institution: Carnegie Mellon University
Subject: Engineering Electronics and Electrical; Engineering Computer; Computer Science
Degree: Ph.D.
Year: 2014
Pages: 150 p.
Total pages: 150
Original format: PDF
Language of text: English

Similar literature

1. Nonlinear Compensation Using the Gauss–Newton Method for Noise-Robust Speech Recognition [J]. Zhao Y., Juang B.-H. Audio, Speech, and Language Processing, IEEE Transactions on. 2012, no. 8
2. Noise speech recognition based on robust features and a model-based noise compensation evaluated on aurora-2 task [J]. Kaisheng Yao, Jingdong Chen, Kuldip K. Paliwal. 電子情報通信学会技術研究報告. 音声. Speech. 2001, no. 522
3. Noise speech recognition based on robust features and a model-based noise compensation evaluated on aurora-2 task [J]. Kaisheng Yao, Jingdong Chen, Kuldip K. Paliwal. 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2001, no. 520
4. Model-based compensation of the additive noise for continuous speech recognition. Experiments using the AURORA II database and tasks [C]. J. C. Segura, A. de la Torre, M. C. Benitez. European conference on speech communication and technology. 2001
5. Compressive nonlinearity for representing speech spectral magnitude to improve noise robustness of automatic speech recognition [D]. Wong, Brian. 2011
6. Speech-in-Noise Test results of compensation claimants for noise induced hearing loss in Korean male workers: Words-in-Noise Test (WIN) and quick-Hearing-in-Noise Test (HINT) [O]. Ji Soo Kim, Joong Keun Kwon, Nam Jeong Kim. 2021
7. Use of Generalised Nonlinearity in Vector Taylor Series Noise Compensation for Robust Speech Recognition [O]. Loweimi, E., Barker, J., Hain, T. 2016
8. Normalized Amplitude Modulation Features for Large Vocabulary Noise-Robust Speech Recognition [R]. Mitra, V., Franco, H., Graciarena, M. 2012
