Conference: International Conference on Signal Processing and Communication Systems

Improving Children's Speech Recognition Through Time Scale Modification Based Speaking Rate Adaptation



Abstract

This paper explores the effect of speaking-rate adaptation on children's speech recognition using acoustic models trained on adults' speech. It is well known that the shape of the vocal organs, pitch, and speaking rate differ significantly between adult and child speakers. Consequently, recognition performance for children's speech in such a mismatched setup is reported to be extremely poor. Many studies have proposed techniques to address the acoustic mismatch resulting from differences in pitch and vocal-tract geometry. However, only a few works have studied the role of speaking-rate adaptation in children's speech recognition, and those studies were performed on systems employing Gaussian mixture models. Motivated by these facts, we explore speaking-rate adaptation in the context of systems employing deep neural network based acoustic modeling. Time-scale modification using an approach based on phase-independent iterative spectrogram inversion is employed for speaking-rate adaptation. Significant reductions in errors are noted by adapting the speaking rate. Furthermore, the effect of combining speaking-rate adaptation with vocal-tract length normalization and pitch scaling is also studied. Additive improvements are obtained by combining the explored techniques with speaking-rate adaptation.
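The abstract names time-scale modification by iterative spectrogram inversion but gives no algorithmic detail. The following is only a minimal sketch of the general idea, not the authors' method: it stretches or compresses the magnitude spectrogram along time and then estimates a consistent phase with a Griffin-Lim-style iteration, which is swapped in here as a well-known stand-in. The function names, the frame and hop sizes, the iteration count, and the convention that `rate > 1` shortens the signal are all assumptions made for illustration.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Hann-windowed short-time Fourier transform, shape (frames, bins)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=512, hop=128):
    """Weighted overlap-add inverse of stft (least-squares signal estimate)."""
    win = np.hanning(n_fft)
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * win
    n_frames = spec.shape[0]
    out = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    for i in range(n_frames):
        out[i * hop:i * hop + n_fft] += frames[i]
        norm[i * hop:i * hop + n_fft] += win ** 2
    return out / np.maximum(norm, 1e-8)

def stretch_magnitude(mag, rate):
    """Linearly resample magnitude frames along time.

    Assumed convention: rate > 1 yields fewer frames (faster speech),
    rate < 1 yields more frames (slower speech).
    """
    n_in = mag.shape[0]
    n_out = max(2, int(round(n_in / rate)))
    pos = np.linspace(0, n_in - 1, n_out)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, n_in - 1)
    frac = (pos - lo)[:, None]
    return (1 - frac) * mag[lo] + frac * mag[hi]

def time_scale_modify(x, rate, n_fft=512, hop=128, n_iter=30, seed=0):
    """Change duration by `rate` while keeping spectral content in place.

    The original phase is discarded; a new phase is estimated by
    alternating between the time domain and the STFT domain while
    re-imposing the stretched magnitude at every step.
    """
    mag = stretch_magnitude(np.abs(stft(x, n_fft, hop)), rate)
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    for _ in range(n_iter):
        y = istft(mag * phase, n_fft, hop)
        phase = np.exp(1j * np.angle(stft(y, n_fft, hop)))
    return istft(mag * phase, n_fft, hop)
```

Under these assumptions, calling `time_scale_modify(x, 1.25)` on a mono signal `x` returns a shorter signal and `time_scale_modify(x, 0.8)` a longer one. The abstract does not state which direction or which factors were applied to the children's speech, so `rate` is left as a free parameter.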
