Multisensor segmentation-based noise suppression for intelligibility improvement in MELP coders.

Abstract

This thesis investigates the use of an auxiliary sensor, the GEMS device, for improving the quality of noisy speech and for designing noise preprocessors for MELP speech coders. The use of auxiliary sensors in noise-robust ASR applications is also investigated, with the aim of developing speech enhancement algorithms that exploit acoustic-phonetic properties of the speech signal.

A Bayesian risk minimization framework is developed that can incorporate the acoustic-phonetic properties of speech sounds and knowledge of human auditory perception into the speech enhancement framework. Two noise suppression systems are presented using the ideas developed in the mathematical framework. In the first system, an aharmonic comb filter is proposed for voiced speech, in which low-energy frequencies are severely suppressed while high-energy frequencies are suppressed mildly. The proposed system outperformed an MMSE estimator in subjective listening tests and in the DRT intelligibility test for MELP-coded noisy speech. The effect of aharmonic comb filtering on the linear predictive coding (LPC) parameters is analyzed using a missing data approach. Suppressing the low-energy frequencies without any modification of the high-energy frequencies is shown to improve the LPC spectrum, as measured by the Itakura-Saito distance.

The second system combines the aharmonic comb filter with the acoustic-phonetic properties of speech to improve the intelligibility of MELP-coded noisy speech. The noisy speech signal is segmented into broad sound classes using a multi-sensor automatic segmentation/classification tool, and each sound class is enhanced differently based on its acoustic-phonetic properties. The proposed system is shown to outperform both the MELPe noise preprocessor and the aharmonic comb filter in intelligibility tests when used in concatenation with the MELP coder.

Since the second noise suppression system uses an automatic segmentation/classification algorithm, exploiting the GEMS signal in an automatic segmentation/classification task is also addressed using an ASR approach. Current ASR engines can segment and classify speech utterances in a single pass; however, they are sensitive to ambient noise. Features extracted from the GEMS signal can be fused with the noisy MFCC features to improve the noise robustness of the ASR system. In the first phase, a voicing feature is extracted from the clean speech signal and fused with the MFCC features. The actual GEMS signal could not be used in this phase because there was insufficient sensor data to train the ASR system. (Abstract shortened by UMI.)
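
As a rough illustration of two ideas named in the abstract, the sketch below applies an energy-dependent gain to a power spectrum (strong attenuation for low-energy bins, mild attenuation for high-energy bins) and computes the Itakura-Saito distance between two power spectra. It is not the thesis's aharmonic comb filter: the two-level gain rule, the function names, and the values of `threshold_db`, `low_gain`, and `high_gain` are all assumptions chosen only for illustration.

```python
import math


def itakura_saito(p_ref, p_test):
    """Mean Itakura-Saito distance between two power spectra.

    Both sequences must have the same length and strictly positive values.
    """
    if len(p_ref) != len(p_test):
        raise ValueError("spectra must have the same length")
    total = 0.0
    for a, b in zip(p_ref, p_test):
        ratio = a / b
        total += ratio - math.log(ratio) - 1.0
    return total / len(p_ref)


def energy_dependent_gain(power, threshold_db=-20.0, low_gain=0.1, high_gain=0.8):
    """Hypothetical two-level amplitude gain per frequency bin.

    Bins whose power lies more than |threshold_db| below the spectral peak
    get the strong attenuation `low_gain`; all other bins get the mild
    attenuation `high_gain`. All power values must be strictly positive.
    """
    peak = max(power)
    gains = []
    for p in power:
        rel_db = 10.0 * math.log10(p / peak)
        gains.append(low_gain if rel_db < threshold_db else high_gain)
    return gains


def apply_gain(power, gains):
    """Apply amplitude gains to a power spectrum (power scales by gain squared)."""
    return [p * g * g for p, g in zip(power, gains)]


if __name__ == "__main__":
    # Toy power spectrum: two strong bins separated by weak bins.
    noisy = [1.0, 0.001, 0.5, 0.002]
    gains = energy_dependent_gain(noisy)
    enhanced = apply_gain(noisy, gains)
    print("gains:", gains)
    print("enhanced:", enhanced)
    print("IS distance (noisy vs. enhanced):", itakura_saito(noisy, enhanced))
```

The Itakura-Saito distance is zero only when the two spectra are identical and is asymmetric in its arguments, so the order of the reference and test spectra matters when comparing LPC spectra.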

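Bibliographic record

  • Author

    Demiroglu, Cenk.

  • Author affiliation

    Georgia Institute of Technology.

  • Degree-granting institution: Georgia Institute of Technology.
  • Subject: Engineering, Electronics and Electrical.
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 132 p.
  • Total pages: 132
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Radio electronics, telecommunications technology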