Evaluation of speech enhancement techniques for speaker recognition in noisy environments.

Abstract

In automatic speaker recognition (ASR) applications, background noise severely degrades recognition performance, so there is strong demand for speech enhancement algorithms capable of removing it. In this thesis, a Gaussian mixture model-based automatic speaker recognition system is used to evaluate the performance of five different speech enhancement techniques. These techniques were previously shown to improve the signal-to-noise ratio (SNR) of noise-corrupted speech signals, but their effect on speaker recognition performance was not fully investigated. In this work, we implement these enhancement techniques and evaluate them as preprocessing blocks to the ASR engine. Performance is measured by speaker recognition accuracy, average segmental SNR, and perceptual evaluation of speech quality (PESQ) scores. We combine clean speech from the TIMIT database with eight different types of noise from the NOISEX-92 database, representing synthetic and natural background noise samples, and analyze the overall system performance. Simulation results show that the system is capable of reducing noise with little speech degradation, and that overall recognition performance can be improved across a range of SNRs and noise types. Furthermore, the results show that different enhancement techniques have different strengths and weaknesses, depending on the application and the background noise type.
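The abstract mentions two operations that are easy to illustrate: mixing clean speech with noise at a chosen SNR, and scoring a processed signal by average segmental SNR. The following is a minimal sketch of both, not the thesis's implementation. The function names, the 256-sample frame length, the [-10, 35] dB clamping range, and the synthetic sine-plus-white-noise test signal (standing in for TIMIT speech and NOISEX-92 noise) are all assumptions made for illustration.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a signal given as a list of floats."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that clean + noise has the requested global SNR in dB."""
    gain = math.sqrt(power(clean) / (power(noise) * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def segmental_snr(clean, processed, frame_len=256, lo=-10.0, hi=35.0):
    """Average per-frame SNR in dB; each frame is clamped to [lo, hi] (assumed limits)."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        p = processed[start:start + frame_len]
        signal_energy = sum(v * v for v in c)
        error_energy = sum((a - b) ** 2 for a, b in zip(c, p))
        if signal_energy == 0.0:
            continue  # skip silent frames, where the ratio is undefined
        if error_energy == 0.0:
            snr = hi  # identical frames: report the upper clamp
        else:
            snr = 10 * math.log10(signal_energy / error_energy)
        values.append(min(max(snr, lo), hi))
    if not values:
        raise ValueError("no non-silent frames to score")
    return sum(values) / len(values)


# Synthetic stand-ins: a 440 Hz tone for speech, white Gaussian noise for background.
random.seed(0)
fs = 8000
clean = [math.sin(2 * math.pi * 440 * t / fs) for t in range(fs)]
noise = [random.gauss(0.0, 1.0) for _ in range(fs)]

noisy = mix_at_snr(clean, noise, snr_db=5.0)
residual = [n - c for n, c in zip(noisy, clean)]
global_snr = 10 * math.log10(power(clean) / power(residual))
seg_snr = segmental_snr(clean, noisy)
print(f"global SNR: {global_snr:.2f} dB, segmental SNR: {seg_snr:.2f} dB")
```

For a stationary test signal like this one, the segmental and global values land close together. On real speech they usually differ, because low-energy frames pull the segmental average down, which is one reason segmental SNR is often reported alongside recognition accuracy and PESQ.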

Bibliographic record

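  • Author

    El-Solh, Abdel-Aziz

  • Author affiliation

    Carleton University (Canada)

  • Degree-granting institution Carleton University (Canada)
  • Subject Engineering, Electronics and Electrical
  • Degree M.A.Sc.
  • Year 2006
  • Pages 124 p.
  • Total pages 124
  • Original format PDF
  • Language of text eng
  • Chinese Library Classification Radio electronics and telecommunications technology
  • Keywords (none listed)

  • Date added to database 2022-08-17 11:39:52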
