Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing

ADDING CONTROLLED AMOUNT OF NOISE TO IMPROVE RECOGNITION OF COMPRESSED AND SPECTRALLY DISTORTED SPEECH



Abstract

This paper deals with the recognition of speech whose spectrum is notably distorted by lossy compression (namely MP3) or by some implementations of 'speech enhancement' techniques. We show that these non-linear treatments can introduce gaps in the spectrum that significantly change the distribution of MFCCs and degrade ASR performance. We propose a method that measures the level of spectral distortion and uses it to add a controlled amount of noise to the signal. The added noise effectively masks the gaps and helps particularly in situations where the source and parameters of the distortion are not known, so a properly matched acoustic model cannot be used. In spite of its simplicity, the method can significantly improve speech recognition on highly compressed or spectrally distorted signals. We demonstrate this in several large experiments conducted on publicly available speech databases, in two languages and for two types of spectral distortion.
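The abstract does not say how the distortion level is measured or how the noise level is derived from it, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a simple gap measure (the fraction of short-time spectrum bins lying more than 80 dB below the spectral peak), a linear mapping from that fraction to a target SNR, and white Gaussian noise. The function names `gap_ratio` and `add_controlled_noise` and all parameter values are hypothetical.

```python
import numpy as np

def gap_ratio(signal, frame_len=512, hop=256, floor_db=-80.0):
    """Fraction of time-frequency bins lying below floor_db relative to the peak.

    Assumed stand-in for a spectral distortion measure; not taken from the paper.
    """
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one analysis frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    peak = power.max()
    if peak == 0.0:
        return 1.0  # an all-zero signal is treated as one large gap
    rel_db = 10.0 * np.log10(np.maximum(power, 1e-30) / peak)
    return float(np.mean(rel_db < floor_db))

def add_controlled_noise(signal, max_snr_db=40.0, min_snr_db=20.0, seed=0):
    """Add white noise whose level grows with the measured gap ratio.

    The linear ratio-to-SNR mapping and the SNR limits are assumptions.
    """
    ratio = gap_ratio(signal)
    snr_db = max_snr_db - ratio * (max_snr_db - min_snr_db)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise, ratio, snr_db

if __name__ == "__main__":
    # Crude imitation of a codec band cut: zero the upper half of the spectrum.
    rng = np.random.default_rng(1)
    clean = rng.normal(size=16000)
    spectrum = np.fft.rfft(clean)
    spectrum[len(spectrum) // 2:] = 0.0
    distorted = np.fft.irfft(spectrum, n=len(clean))

    masked, ratio, snr_db = add_controlled_noise(distorted)
    print("gap ratio before:", ratio)
    print("chosen SNR (dB):", snr_db)
    print("gap ratio after:", gap_ratio(masked))
```

In this toy setup the zeroed band shows up as a large gap ratio, and the added noise fills the empty bins so the ratio drops again. A real system would need a measure and a mapping tuned against recognition accuracy.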
