
Lecture speech recognition considering the speaking rate variation



Abstract

In lecture speech recognition, the performance of a speech recognition system degrades as the speaking rate increases. This degradation is caused by changes in acoustic characteristics in both the frequency domain and the time domain, so normalizing or compensating for the speaking rate is important. In this paper, we propose a speaking rate compensation method that selects an optimal frame period and frame length using a likelihood criterion. The method changes the frame period and length to compensate for the speaking rate. However, the optimal frame period and length differ from utterance to utterance. Therefore, the proposed method runs speech recognition with various frame periods and lengths and determines the optimal pair for the target speech using the acoustic likelihood normalized by the frame period, together with the language likelihood. In a recognition experiment on the CSJ corpus, the method improves performance on speech with a high speaking rate.
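The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `decode` callable, the `Hypothesis` type, the `reference_period_ms` and `lm_weight` parameters, and in particular the normalization formula are all assumptions, since the abstract does not give the exact form of the criterion. The sketch assumes that the summed acoustic log-likelihood grows with the number of frames, and therefore rescales it by the ratio of the candidate frame period to a reference period before adding the language log-likelihood.

```python
from typing import Callable, Iterable, NamedTuple, Tuple


class Hypothesis(NamedTuple):
    """Result of one recognition pass (hypothetical container)."""
    text: str
    acoustic_loglik: float  # log-likelihood summed over all frames
    language_loglik: float  # language model log-likelihood


def select_frame_setting(
    decode: Callable[[float, float], Hypothesis],
    settings: Iterable[Tuple[float, float]],
    reference_period_ms: float = 10.0,
    lm_weight: float = 1.0,
) -> Tuple[Tuple[float, float], Hypothesis, float]:
    """Decode once per (frame period, frame length) pair and keep the best.

    `decode(period_ms, length_ms)` is a placeholder for a full recognizer
    run on one utterance with the given front-end settings.
    """
    best = None
    for period_ms, length_ms in settings:
        hyp = decode(period_ms, length_ms)
        # Assumed normalization: a shorter period yields more frames, so
        # scale the summed acoustic score back to the reference frame rate.
        normalized_acoustic = hyp.acoustic_loglik * (period_ms / reference_period_ms)
        score = normalized_acoustic + lm_weight * hyp.language_loglik
        if best is None or score > best[2]:
            best = ((period_ms, length_ms), hyp, score)
    if best is None:
        raise ValueError("settings must not be empty")
    return best
```

Because the best setting is chosen per utterance, the function would be called once for each utterance, with `settings` listing the candidate frame periods and lengths to try. The cost is one recognition pass per candidate.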
