Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

A pitch extraction algorithm tuned for automatic speech recognition

Abstract

In this paper we present an algorithm that produces pitch and probability-of-voicing estimates for use as features in automatic speech recognition systems. These features give ASR systems large performance improvements on tonal languages, and substantial improvements even on non-tonal languages. Our method, which we are calling the Kaldi pitch tracker (because we are adding it to the Kaldi ASR toolkit), is a highly modified version of the getf0 (RAPT) algorithm. Unlike the original getf0, we do not make a hard decision about whether any given frame is voiced or unvoiced; instead, we assign a pitch even to unvoiced frames while constraining the pitch trajectory to be continuous. Our algorithm also produces a quantity that can be used as a probability-of-voicing (POV) measure; it is based on the normalized autocorrelation measure that our pitch extractor uses. We present results on data from various languages in the BABEL project, and show a large improvement over systems without tonal features and over systems in which pitch and POV information was obtained from SAcC or getf0.
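
The abstract names the ingredients (a normalized autocorrelation measure, a pitch value for every frame, and a continuity constraint on the trajectory) but gives no implementation details. The following is a minimal Python sketch of that general idea, not the Kaldi pitch tracker itself. The function names `nccf` and `track_pitch`, the squared log-lag transition penalty, the small bias toward shorter lags (`lag_bias_hz`, used here only to break ties between a period and its multiples), and every default value are assumptions made for illustration. The returned voicing score is simply the raw correlation at the chosen lag; how it would be mapped to a probability is not stated in the abstract.

```python
import math


def nccf(frame, lag):
    """Normalized cross-correlation between a frame and its lag-shifted copy."""
    n = len(frame) - lag
    a, b = frame[:n], frame[lag:lag + n]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0  # silent frame: no evidence for any lag


def track_pitch(signal, sample_rate, frame_len=400, hop=80, f_min=60.0,
                f_max=400.0, jump_penalty=0.1, lag_bias_hz=10.0):
    """Return (f0_hz, voicing_score), one value per frame.

    Illustrative assumptions: candidate lags are whole samples, the local
    cost is 1 - NCCF with a mild bias toward short lags, and the transition
    cost is jump_penalty * (log lag ratio)^2. No voiced/unvoiced decision is
    made; a Viterbi search picks one lag per frame.
    """
    lags = list(range(math.ceil(sample_rate / f_max),
                      math.floor(sample_rate / f_min) + 1))
    starts = range(0, len(signal) - frame_len + 1, hop)
    scores = [[nccf(signal[s:s + frame_len], lag) for lag in lags]
              for s in starts]
    if not scores:
        return [], []

    num_lags = len(lags)
    log_lags = [math.log(lag) for lag in lags]
    # trans[j][i]: cost of moving from lag i (previous frame) to lag j.
    trans = [[jump_penalty * (log_lags[i] - log_lags[j]) ** 2
              for i in range(num_lags)] for j in range(num_lags)]

    def local_cost(t, j):
        bias = 1.0 - lag_bias_hz * lags[j] / sample_rate
        return 1.0 - scores[t][j] * bias

    # Viterbi forward pass over all frames, including low-correlation ones.
    cost = [local_cost(0, j) for j in range(num_lags)]
    backpointers = []
    for t in range(1, len(scores)):
        new_cost, pointers = [], []
        for j in range(num_lags):
            best_i = min(range(num_lags), key=lambda i: cost[i] + trans[j][i])
            new_cost.append(cost[best_i] + trans[j][best_i] + local_cost(t, j))
            pointers.append(best_i)
        cost = new_cost
        backpointers.append(pointers)

    # Backtrace from the cheapest final state.
    j = min(range(num_lags), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(backpointers):
        j = pointers[j]
        path.append(j)
    path.reverse()

    f0_hz = [sample_rate / lags[j] for j in path]
    voicing_score = [scores[t][j] for t, j in enumerate(path)]
    return f0_hz, voicing_score


# Usage: a 200 Hz tone followed by silence, sampled at 8 kHz.
signal = [math.sin(2 * math.pi * 200.0 * t / 8000) for t in range(1200)]
signal += [0.0] * 1200
f0, pov = track_pitch(signal, 8000)
```

In this sketch the silent frames still receive a pitch value, because the transition penalty makes it cheapest to hold the last lag when the correlation offers no evidence, while their voicing score drops to zero. A downstream system can then consume both streams as continuous features instead of relying on a hard voiced/unvoiced flag.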
