Speech Enhancement Based on Full-Sentence Correlation and Clean Speech Recognition

Abstract

Conventional speech enhancement methods, based on frame, multi-frame, or segment estimation, require knowledge about the noise. This paper presents a new method that aims to reduce or effectively remove this requirement. It is shown that, by using the Zero-mean Normalized Correlation Coefficient (ZNCC) as the comparison measure, and by extending the effective length of speech segment matching to sentence-long speech utterances, it is possible to obtain an accurate speech estimate from noisy speech without requiring specific knowledge about the noise. The new method could thus be used to deal with unpredictable noise or noise for which no proper training data is available. This paper focuses on realizing and evaluating this potential. We propose a novel realization that integrates full-sentence speech correlation with clean speech recognition, formulated as a constrained maximization problem, to overcome the data sparsity problem. We then propose an efficient algorithm that solves this constrained maximization problem to produce speech sentence estimates. For evaluation, we build the new system on one training data set and test it on two different test data sets across two databases, for a range of different noises including highly nonstationary ones. It is shown that the new approach, without any estimation of the noise, significantly outperforms conventional methods that use optimized noise tracking, in terms of various objective measures including automatic speech recognition.
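
As a rough illustration of the comparison measure named in the abstract, the following is a minimal sketch of the ZNCC between two equal-length sequences. It is not the authors' implementation: the function name `zncc`, the use of plain Python lists in place of speech feature vectors, and the convention of returning 0.0 when one sequence is constant are all assumptions made here for illustration.

```python
import math
from typing import Sequence


def zncc(x: Sequence[float], y: Sequence[float]) -> float:
    """Zero-mean normalized correlation coefficient of two equal-length sequences.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Remove the mean of each sequence (the "zero-mean" part).
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    # Correlate and normalize by the product of the two norms.
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if den == 0.0:
        return 0.0  # assumed convention when one sequence is constant
    return num / den


clean = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
# Same shape as `clean`, but with a different gain and a constant offset.
scaled = [3.0 * v + 2.0 for v in clean]
score = zncc(clean, scaled)
```

Because the mean is subtracted and the result is normalized, `score` is close to 1 even though `scaled` differs from `clean` in gain and offset. This invariance to gain and offset is a general property of the measure; how the paper applies it to sentence-long segments is described only at the level of the abstract above.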

Bibliographic record

  • Authors

    Ming Ji; Crookes Daniel;

  • Author affiliation
  • Year 2017
  • Total pages
  • Original format PDF
  • Text language eng
  • CLC classification
