Feature Compensation Techniques for ASR on Band-Limited Speech

Morales N.; Toledano D.T.; Hansen J.H.L.; Garrido J.

Abstract

Band-limited speech (speech for which parts of the spectrum are completely lost) is a major cause of accuracy degradation in automatic speech recognition (ASR) systems, particularly when acoustic models have been trained on data with a different spectral range. In this paper, we present an extensive study of the problem of ASR of band-limited speech with full-bandwidth acoustic models. Our focus is mainly on band-limited feature compensation, covering even the case of time-varying band-limiting distortions, but we also compare this approach to more common model-side techniques (adaptation and retraining) and explore the combination of feature-based and model-side approaches. The proposed feature compensation algorithms are organized in a unified framework supported by a novel mathematical model of the impact of such distortions on Mel-frequency cepstral coefficient (MFCC) features. A crucial and novel contribution is the analysis of the relative correlation of different elements in the MFCC feature vector for the cases of full-bandwidth and limited-bandwidth speech, which justifies an important modification in the feature compensation scheme. Furthermore, an intensive experimental analysis is provided. Experiments are conducted on real telephone channels, as well as on artificial low-pass and bandpass filters applied to TIMIT data, and results are given for different experimental constraints and variations of the feature compensation method. Results for other well-known robustness approaches, such as cepstral mean normalization (CMN), model retraining, and model adaptation, are also given for comparison. ASR performance with our approach is similar to or even better than that of model adaptation, and we argue that in particular cases, such as rapidly varying distortions or limited computational or memory resources, feature compensation is more convenient. Furthermore, we show that feature-side and model-side approaches may be combined, outperforming either approach alone.
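
The abstract names cepstral mean normalization (CMN) as one of the comparison baselines. As an illustration only, and not the paper's own compensation method, the following minimal sketch shows per-utterance CMN on a matrix of cepstral features. The function name, the array shapes, and the random stand-in data are assumptions made for this example; real MFCC features would come from a feature extraction front end.

```python
import numpy as np


def cepstral_mean_normalization(features: np.ndarray) -> np.ndarray:
    """Subtract the per-utterance mean of each cepstral coefficient.

    features has shape (num_frames, num_coeffs); hypothetical layout.
    """
    return features - features.mean(axis=0, keepdims=True)


# Random stand-in for MFCC frames (assumption: 200 frames, 13 coefficients).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))

# A fixed additive offset in the cepstral domain, the usual simplified
# model of a time-invariant linear channel.
channel_offset = rng.normal(size=(1, 13))
distorted = clean + channel_offset

# CMN removes the constant offset, so both versions normalize identically.
assert np.allclose(
    cepstral_mean_normalization(clean),
    cepstral_mean_normalization(distorted),
)
```

CMN only cancels a constant cepstral offset. It cannot restore spectral regions that a band-limiting channel has removed entirely, which is consistent with the abstract treating it as a baseline rather than as the proposed solution.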

Bibliographic details

Source: Audio, Speech, and Language Processing, IEEE Transactions on | 2009, Issue 4 | pp. 758-774 | 17 pages
Authors: Morales N.; Toledano D.T.; Hansen J.H.L.; Garrido J.
Author affiliation: Nuance Commun. GmbH, Aachen
Original format: PDF
Language: English
Keywords: Automatic speech recognition (ASR); feature compensation; restricted communications bandwidth channels; robustness
