Published in: IEEE Transactions on Audio, Speech, and Language Processing

Environmental Sniffing: Noise Knowledge Estimation for Robust Speech Systems

Abstract

Automatic speech recognition systems work reasonably well under clean conditions but become fragile in practical applications involving real-world environments. To date, most approaches dealing with environmental noise in speech systems are based on assumptions concerning the noise, or differences in collecting and training on a specific noise condition, rather than exploring the nature of the noise. As such, speech recognition, speaker ID, or coding systems are typically retrained when new acoustic conditions are to be encountered. In this paper, we propose a new framework entitled Environmental Sniffing to detect, classify, and track acoustic environmental conditions. The first goal of the framework is to seek out detailed information about the environmental characteristics instead of just detecting environmental changes. The second goal is to organize this knowledge in an effective manner to allow smart decisions to direct subsequent speech processing systems. Our current framework uses a number of speech processing modules including a hybrid algorithm with T2-BIC segmentation, Gaussian mixture model/hidden Markov model (GMM/HMM)-based classification and noise language modeling to achieve effective noise knowledge estimation. We define a new information criterion that incorporates the impact of noise into Environmental Sniffing performance. We use an in-vehicle speech and noise environment as a test platform for our evaluations and investigate the integration of Environmental Sniffing for automatic speech recognition (ASR) in this environment. Noise sniffing experiments show that our proposed hybrid algorithm achieves a classification error rate of 25.51%, outperforming our baseline system by 7.08%. The sniffing framework is compared to a ROVER solution for ASR using different noise-conditioned recognizers in terms of word error rate (WER) and CPU usage. Results show that the model matching scheme using the knowledge extracted from the audio stream by Environmental Sniffing achieves better performance than a ROVER solution both in accuracy and computation. A relative 11.1% WER improvement is achieved with a relative 75% reduction in CPU resources.
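The abstract names T2-BIC segmentation but does not describe it in detail. As a rough illustration of the BIC part only, the sketch below scores a single candidate change point in a sequence of feature frames with the standard ΔBIC test, modeling each segment as one full-covariance Gaussian. It is not the paper's hybrid algorithm: the T2 pre-selection step is omitted, and the names `delta_bic` and `find_change_point`, together with the `penalty_weight` and `margin` parameters, are hypothetical choices made for this example.

```python
import numpy as np


def _logdet_cov(x):
    """Log-determinant of the ML covariance of x (rows are frames)."""
    d = x.shape[1]
    cov = np.cov(x, rowvar=False, bias=True).reshape(d, d)
    cov = cov + 1e-6 * np.eye(d)  # small ridge for numerical stability (assumption)
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(frames, i, penalty_weight=1.0):
    """Delta-BIC for splitting frames at index i; positive values favor a change."""
    n, d = frames.shape
    gain = 0.5 * (
        n * _logdet_cov(frames)
        - i * _logdet_cov(frames[:i])
        - (n - i) * _logdet_cov(frames[i:])
    )
    # Extra parameters of the two-Gaussian model: one mean and one covariance.
    n_params = d + 0.5 * d * (d + 1)
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty


def find_change_point(frames, margin=10, penalty_weight=1.0):
    """Return (index, score) of the best split, or (None, score) if none is positive."""
    n = len(frames)
    if n <= 2 * margin:
        raise ValueError("need more than 2 * margin frames")
    candidates = range(margin, n - margin)
    scores = [delta_bic(frames, i, penalty_weight) for i in candidates]
    best = int(np.argmax(scores))
    if scores[best] > 0:
        return candidates[best], scores[best]
    return None, scores[best]


if __name__ == "__main__":
    # Synthetic stand-in for two noise conditions; not data from the paper.
    rng = np.random.default_rng(0)
    quiet = rng.normal(0.0, 1.0, size=(200, 2))
    loud = rng.normal(5.0, 1.0, size=(200, 2))
    print(find_change_point(np.vstack([quiet, loud])))
```

In a full system along the lines the abstract describes, each detected segment would then be passed to a noise classifier, and the predicted noise class would select the matching recognizer; those stages are not sketched here.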
