Extraction of Spectro-Temporal Speech Cues for Robust Automatic Speech Recognition

Abstract

This work analyzes the use of spectro-temporal signal characteristics with the aim of improving the robustness of automatic speech recognition (ASR) systems. Experiments that aim at the robustness against extrinsic sources of variability (such as additive noise) as well as intrinsic variation of speech (changes in speaking rate, style, and effort) are presented. Results are compared to scores for the most common features in ASR (mel-frequency cepstral coefficients and perceptual linear prediction features), which account for the spectral properties of short-time segments of speech, but mostly neglect temporal or spectro-temporal cues. Intrinsic variations were found to severely degrade the overall ASR performance. The performance of the two most common feature types was degraded in much the same way, whereas the proposed spectro-temporal features exhibit a different sensitivity against intrinsic variations, which suggests that classic and spectro-temporal feature types carry complementary information. Furthermore, spectro-temporal features were shown to be more robust than the baseline system in the presence of additive noise.

Bibliographic record

Source: AES international conference, 2011, pp. 108-117 (10 pages)
Author: Bernd T. Meyer
Original format: PDF
Chinese Library Classification: Acoustic engineering

Similar literature

1. Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition [J]. Marc René Schädler, Bernd T. Meyer, Birger Kollmeier. The Journal of the Acoustical Society of America. 2012, No. 5
2. Nonlinear spectro-temporal features based on a cochlear model for automatic speech recognition in a noisy situation [J]. Choi Y.-S., Lee S.-Y. Neural Networks: The Official Journal of the International Neural Network Society. 2013
3. Extraction of Spectro-Temporal Speech Cues for Robust Automatic Speech Recognition [C]. Bernd T. Meyer. Audio Engineering Society International Conference. 2011
4. Array-based Spectro-temporal Masking for Automatic Speech Recognition [D]. Moghimi, Amir R. 2014
5. Brain-inspired speech segmentation for automatic speech recognition using the speech envelope as a temporal reference [O]. Byeongwook Lee, Kwang-Hyun Cho
6. Noise Robust Automatic Speech Recognition Based on Spectro-Temporal Techniques [O]. Kovács György
7. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015
