ADDING CONTROLLED AMOUNT OF NOISE TO IMPROVE RECOGNITION OF COMPRESSED AND SPECTRALLY DISTORTED SPEECH

Abstract

This paper deals with the recognition of speech whose spectrum is notably distorted by lossy compression (in particular MP3) or by some implementations of 'speech enhancement' techniques. We show that these non-linear treatments can introduce gaps in the spectrum that significantly change the distribution of MFCCs and degrade ASR performance. We propose a method that measures the level of spectral distortion and uses it to add a controlled amount of noise to the signal. The noise effectively masks the gaps and helps particularly in situations where the source and parameters of the distortion are unknown, so that a properly matched acoustic model cannot be used. In spite of its simplicity, the method can significantly improve the recognition of highly compressed or spectrally distorted speech. We demonstrate this in several large experiments conducted on publicly available speech databases, in two languages and for two types of spectral distortion.
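
The idea of measuring spectral gaps and masking them with noise can be illustrated with a minimal sketch. The abstract does not say how the distortion level is measured or how the noise level is derived from it, so everything below is an assumption made for illustration: the `gap_ratio` measure (fraction of time-frequency bins far below the peak), the -60 dB floor, the linear mapping from gap ratio to SNR, and the 20 to 40 dB SNR range are hypothetical choices, not the authors' method.

```python
import numpy as np


def stft_power(signal, frame_len=512, hop=256):
    """Power spectrogram of Hann-windowed frames, shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def gap_ratio(signal, floor_db=-60.0):
    """Hypothetical distortion measure: fraction of time-frequency bins
    whose power lies more than |floor_db| dB below the strongest bin."""
    power = stft_power(signal)
    peak = power.max()
    if peak == 0.0:
        return 1.0  # an all-zero signal is treated as one big gap
    level_db = 10.0 * np.log10(power / peak + 1e-20)
    return float(np.mean(level_db < floor_db))


def add_controlled_noise(signal, max_snr_db=40.0, min_snr_db=20.0, rng=None):
    """Add white Gaussian noise whose level grows with the measured gap ratio.

    Assumed mapping: gap ratio 0 gives max_snr_db (little noise),
    gap ratio 1 gives min_snr_db (most noise).
    Returns (noisy_signal, gap_ratio, snr_db).
    """
    rng = np.random.default_rng() if rng is None else rng
    ratio = gap_ratio(signal)
    snr_db = max_snr_db - ratio * (max_snr_db - min_snr_db)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise, ratio, snr_db


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=16000)
    # Crude stand-in for codec band-limiting: zero the upper half of the spectrum.
    spectrum = np.fft.rfft(clean)
    spectrum[len(spectrum) // 2 :] = 0.0
    distorted = np.fft.irfft(spectrum, n=len(clean))
    noisy, ratio, snr_db = add_controlled_noise(distorted, rng=rng)
    print("gap ratio before:", ratio, "after:", gap_ratio(noisy), "SNR (dB):", snr_db)
```

In this sketch the noise is applied to the waveform before feature extraction, so no knowledge of the codec or enhancement algorithm is needed, which matches the unknown-distortion scenario described in the abstract. A real system would need the measure and the mapping tuned against recognition results.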

Bibliographic record

Source: IEEE International Conference on Acoustics, Speech, and Signal Processing, 2013, 5 pages
Authors: Jan Nouza; Petr Cerva; Jan Silovsky
Original format: PDF
Chinese Library Classification: TN912-53
