Synergy of acoustic-phonetics and auditory modeling towards robust speech recognition.

Abstract

This work addresses two related problems: enhancing speech signals corrupted by additive noise, and improving the performance of automatic speech recognizers in noisy conditions. The enhanced speech signals can also improve the intelligibility of speech in noisy conditions, both for listeners with hearing impairment and for listeners with normal hearing.

The original Phase Opponency (PO) model, proposed to detect tones in noise, simulates the processing of information carried in neural discharge times. It exploits the frequency-dependent phase properties of the tuned filters in the auditory periphery, together with coincidence detection across auditory-nerve fibers, to extract temporal cues. The Modified Phase Opponency (MPO) model proposed here alters the components of the PO model so that its basic functionality is maintained while the various properties of the model can be analyzed and modified independently of each other. This work presents a detailed mathematical formulation of the MPO model and the relation between the properties of the narrowband signal to be detected and the properties of the MPO model. The MPO speech enhancement scheme is based on the premise that speech signals are composed of a combination of narrowband signals (i.e., harmonics) with varying amplitudes.

The MPO enhancement scheme outperforms many other speech enhancement techniques when evaluated using different objective quality measures. Automatic speech recognition experiments show that replacing noisy speech signals with the corresponding MPO-enhanced speech signals improves recognition accuracy at low SNRs. The amount of improvement varies with the type of corrupting noise.

Perceptual experiments indicate that: (a) there is little perceptual difference between the MPO-processed clean speech signals and the corresponding original clean signals, and (b) the MPO-enhanced speech signals are preferred over the output of the other enhancement methods when the speech signals are corrupted by subway noise, but the outputs of the other enhancement schemes are preferred when the speech signals are corrupted by car noise.
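As a rough illustration of the test condition the abstract refers to (speech-like harmonic signals corrupted by additive noise at a chosen SNR), the following sketch may help. It is not taken from the dissertation: the function names, the 100 Hz fundamental, the harmonic amplitudes, the sampling rate, and the use of white Gaussian noise are all assumptions made for the example, and it does not implement the MPO model itself.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Return clean + g * noise, with g chosen so the mixture has the requested SNR (dB)."""
    gain = math.sqrt(power(clean) / (power(noise) * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of a noisy signal, given the clean reference."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


# Hypothetical test signal (an assumption, not from the source):
# three harmonics of a 100 Hz fundamental with different amplitudes, 1 s at 8 kHz.
rate, f0 = 8000, 100.0
harmonics = ((1, 1.0), (2, 0.5), (3, 0.25))  # (harmonic number, amplitude)
clean = [
    sum(a * math.sin(2 * math.pi * k * f0 * n / rate) for k, a in harmonics)
    for n in range(rate)
]

# White Gaussian noise stands in for the corrupting noise (also an assumption).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(rate)]

noisy = mix_at_snr(clean, noise, snr_db=0.0)
print(round(measured_snr_db(clean, noisy), 2))
```

Scaling the noise rather than the clean signal keeps the reference level fixed across SNR conditions, which is a common convention but only one of several possible choices.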

Bibliographic details

Author: Deshmukh, Om D.
Author affiliation: University of Maryland, College Park
Degree-granting institution: University of Maryland, College Park
Subjects: Health Sciences Audiology; Engineering Electronics and Electrical; Health Sciences Speech Pathology
Degree: Ph.D.
Year: 2006
Pages: 180
Original format: PDF
Language: English
