
Restoring The Missing Features of the Corrupted Speech using Linear Interpolation Methods

Abstract

Noise is one of the main challenges in Automatic Speech Recognition (ASR): the performance of an ASR system degrades significantly when the speech is corrupted by noise. When the low Signal-to-Noise Ratio (SNR) elements are deleted from the spectrogram representation of a speech signal, an incomplete spectrogram is obtained. One direction is for the speech recognizer to modify the spectrogram so as to restore the missing elements. Another is to restore the elements lost by deleting low-SNR elements before recognition is performed, which can be done using different spectrogram reconstruction methods. In this paper, the geometrical spectrogram reconstruction methods suggested by some researchers are implemented as a toolbox. These methods use linear interpolation along time or along frequency to predict the missing elements between adjacent observed elements in the spectrogram. In addition, a new linear interpolation method that uses time and frequency together is presented. The CMU Sphinx III software is used in the experiments to test the performance of the linear interpolation reconstruction method. The experiments are carried out under different conditions, such as different window lengths and different utterance lengths. The speech corpus used in the experiments consists of 20 male and 20 female speakers, each contributing two different utterances. As a result, 80% recognition accuracy is achieved at an SNR of 25%.
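The interpolation idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's toolbox: the function names `interpolate_axis` and `interpolate_time_frequency`, the frequency-by-time array layout, and the boolean reliability mask are all assumptions made for illustration. In particular, the abstract does not say how the new method combines time and frequency, so the simple averaging of the two estimates below is only one plausible reading.

```python
import numpy as np


def interpolate_axis(spec, mask, axis):
    """Fill missing spectrogram cells by linear interpolation along one axis.

    spec: 2-D array, assumed layout (frequency bins x time frames).
    mask: boolean array of the same shape, True where the cell is observed.
    axis: 1 interpolates along time, 0 along frequency.
    """
    spec = np.asarray(spec, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if axis == 0:
        # Interpolating along frequency is interpolating along time
        # on the transposed spectrogram.
        return interpolate_axis(spec.T, mask.T, axis=1).T

    out = spec.copy()
    positions = np.arange(spec.shape[1])
    for r in range(spec.shape[0]):
        known = mask[r]
        if known.any():
            # np.interp is linear between observed neighbours and holds
            # the nearest observed value beyond the first/last one.
            out[r, ~known] = np.interp(
                positions[~known], positions[known], spec[r, known]
            )
        else:
            # No observed cell on this line: nothing to interpolate from.
            out[r, :] = np.nan
    return out


def interpolate_time_frequency(spec, mask):
    """Combine both directions by averaging (an assumed combination rule)."""
    estimates = np.stack([
        interpolate_axis(spec, mask, axis=1),  # along time
        interpolate_axis(spec, mask, axis=0),  # along frequency
    ])
    valid = ~np.isnan(estimates)
    total = np.where(valid, estimates, 0.0).sum(axis=0)
    count = valid.sum(axis=0)
    # Average whichever estimates exist; stay NaN if neither direction has one.
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)
```

Observed cells pass through unchanged, since both directional estimates return the original value there. A cell with no observed element in either its row or its column stays `NaN`, which leaves the fallback choice to the caller.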