Conference: International Symposium on Intelligent Signal Processing and Communications

Study of Relationships between Intra-speaker's Speech Variability and Speech Recognition Performance

Abstract

Speech recognition performance varies even when a speaker uses a speaker-dependent speech recognition system. This is because speech quality varies with factors such as emotion and background noise, even when the speaker and the utterance remain constant. However, the relationships between intra-speaker speech variability and speech recognition performance are not clear. We therefore focus on the intra-speaker speech variability that affects speech recognition performance. To investigate these relationships, we have been collecting speech data since November 2002, and we conducted speech recognition experiments using part of this speech corpus. In this paper, we use correlation analysis to examine the relationships between intra-speaker speech variability and phoneme accuracy. The factors in the correlation analysis are the number of errors, the speaking rate, and the likelihood. The results show a strong correlation between the number of substitution errors and phoneme accuracy, whereas the correlations for the numbers of deletion and insertion errors are low. We therefore consider that phonemes overlap because the feature parameters vary with speaking rate. Improving phoneme accuracy requires studying a method that discriminates between phonemes. On the other hand, although the correlation between phoneme accuracy and speaking rate appears to be low, strong correlations are found between the speaking rate and the numbers of deletion and insertion errors. Because the numbers of insertion and deletion errors counterbalance each other, the correlation between speaking rate and phoneme accuracy was low. Nevertheless, we consider it necessary to normalize the speaking rate, because it influences the numbers of deletion and insertion errors.
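
The counterbalance argument in the abstract can be illustrated with a small sketch. It assumes the common definition of phoneme accuracy, (N - S - D - I) / N, which the abstract itself does not state, and it uses invented toy numbers rather than any data from the paper's corpus. The helper names `phoneme_accuracy` and `pearson` are hypothetical. In the toy data, deletions rise and insertions fall as the speaking rate increases, so their sum stays constant and the speaking rate ends up essentially uncorrelated with accuracy, even though it correlates strongly with each error type.

```python
from math import sqrt


def phoneme_accuracy(n_ref, subs, dels, ins):
    """Assumed definition: (N - S - D - I) / N for N reference phonemes."""
    return (n_ref - subs - dels - ins) / n_ref


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy values invented for illustration only (not from the paper).
N_REF = 100                      # reference phonemes per session
rate = [6.0, 7.0, 8.0, 9.0, 10.0]  # hypothetical speaking rates
dels = [2, 4, 6, 8, 10]          # deletions grow with speaking rate
ins = [10, 8, 6, 4, 2]           # insertions shrink with speaking rate
subs = [12, 5, 9, 5, 12]         # substitutions vary independently of rate

acc = [phoneme_accuracy(N_REF, s, d, i) for s, d, i in zip(subs, dels, ins)]

print("rate vs deletions  :", pearson(rate, dels))
print("rate vs insertions :", pearson(rate, ins))
print("rate vs accuracy   :", pearson(rate, acc))
print("subs vs accuracy   :", pearson(subs, acc))
```

Because deletions plus insertions are constant in this toy set, accuracy depends only on the substitution count, which mirrors the pattern the abstract reports without reproducing its actual measurements.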
