Synergy of acoustic-phonetics and auditory modeling towards robust speech recognition.


Abstract

This work addresses two related problems: enhancing speech signals corrupted by additive noise, and improving the performance of automatic speech recognizers in noisy conditions. The enhanced speech signals can also improve speech intelligibility in noisy conditions for listeners with hearing impairment as well as for listeners with normal hearing.

The original Phase Opponency (PO) model, proposed to detect tones in noise, simulates the processing of information carried in neural discharge times. It exploits the frequency-dependent phase properties of the tuned filters in the auditory periphery, together with coincidence detection across auditory nerve fibers, to extract temporal cues. The Modified Phase Opponency (MPO) model proposed here alters the components of the PO model so that its basic functionality is maintained while the various properties of the model can be analyzed and modified independently of each other. This work presents a detailed mathematical formulation of the MPO model and of the relation between the properties of the narrowband signal to be detected and the properties of the MPO model. The MPO speech enhancement scheme is based on the premise that speech signals are composed of a combination of narrowband signals (i.e., harmonics) with varying amplitudes.

The MPO enhancement scheme outperforms many of the other speech enhancement techniques when evaluated using different objective quality measures. Automatic speech recognition experiments show that replacing noisy speech signals with the corresponding MPO-enhanced speech signals improves recognition accuracy at low SNRs. The amount of improvement varies with the type of corrupting noise.

Perceptual experiments indicate that: (a) there is little perceptual difference between the MPO-processed clean speech signals and the corresponding original clean signals, and (b) the MPO-enhanced speech signals are preferred over the output of the other enhancement methods when the speech signals are corrupted by subway noise, but the outputs of the other enhancement schemes are preferred when the speech signals are corrupted by car noise.

Bibliographic record

  • Author

    Deshmukh, Om D.

  • Author affiliation

    University of Maryland, College Park

  • Degree-granting institution University of Maryland, College Park
  • Subject Health Sciences Audiology; Engineering Electronics and Electrical; Health Sciences Speech Pathology
  • Degree Ph.D.
  • Year 2006
  • Pages 180 p.
  • Total pages 180
  • Original format PDF
  • Language of text English (eng)
