Journal: Electrical, Control and Communication Engineering

Improving Speech Recognition Rate through Analysis Parameters

Abstract

A speech signal is redundant and non-stationary by nature. Because of vocal tract inertia, its properties change relatively slowly, so the signal can be treated as stationary over short segments. The short-time magnitude spectrum is presumed to carry the most distinctive speech information, which is the main reason speech is analyzed frame by frame. For this purpose, the signal is split into overlapping segments (so-called frames), typically 15-25 ms long with an overlap of 10-15 ms. In this paper we present the results of our investigation into how analysis window length and frame shift influence the speech recognition rate. We analyzed three cepstral analysis approaches: mel-frequency cepstral analysis (MFCC), linear prediction cepstral analysis (LPCC), and perceptual linear prediction cepstral analysis (PLPC). The highest recognition rate was obtained with a 10 ms analysis window and a frame shift between 7.5 and 10 ms, regardless of the analysis type. The largest increase in recognition rate was 2.5 %.
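To make the relationship between window length, frame shift, and overlap concrete, the following is a minimal sketch of frame-by-frame segmentation. It is not the authors' implementation: the function names `frame_signal` and `overlap_ms`, the 16 kHz sample rate, and the synthetic sine input are assumptions chosen for illustration only, and no windowing function or cepstral analysis is applied.

```python
import math


def overlap_ms(window_ms, shift_ms):
    """Overlap between consecutive frames: window length minus frame shift."""
    return max(window_ms - shift_ms, 0.0)


def frame_signal(samples, sample_rate_hz, window_ms, shift_ms):
    """Split `samples` into frames of `window_ms`, advancing `shift_ms` per frame.

    Only complete frames are returned; trailing samples that do not fill
    a whole window are dropped (an assumption, padding is another option).
    """
    window_len = int(round(sample_rate_hz * window_ms / 1000.0))
    shift_len = int(round(sample_rate_hz * shift_ms / 1000.0))
    if window_len <= 0 or shift_len <= 0:
        raise ValueError("window and shift must each span at least one sample")

    frames = []
    start = 0
    while start + window_len <= len(samples):
        frames.append(samples[start:start + window_len])
        start += shift_len
    return frames


if __name__ == "__main__":
    sample_rate_hz = 16000  # assumed sample rate, not stated in the abstract
    # One second of a synthetic 200 Hz tone standing in for speech.
    signal = [math.sin(2 * math.pi * 200 * n / sample_rate_hz)
              for n in range(sample_rate_hz)]

    # Settings taken from the abstract: a commonly used 25 ms window with
    # 15 ms overlap, and the best-performing 10 ms window with 7.5-10 ms shift.
    for window_ms, shift_ms in [(25, 10), (10, 7.5), (10, 10)]:
        frames = frame_signal(signal, sample_rate_hz, window_ms, shift_ms)
        print(window_ms, shift_ms, overlap_ms(window_ms, shift_ms), len(frames))
```

A shorter window with a shift close to the window length yields frames with little or no overlap, which is the configuration the abstract reports as giving the highest recognition rate.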