Journal: EURASIP Journal on Audio, Speech, and Music Processing

On the Utility of Syllable-Based Acoustic Models for Pronunciation Variation Modelling


Abstract

Recent research on the TIMIT corpus suggests that longer-length acoustic models are more appropriate for pronunciation variation modelling than the context-dependent phones that conventional automatic speech recognisers use. However, the impressive speech recognition results obtained with longer-length models on TIMIT remain to be reproduced on other corpora. To understand the conditions in which longer-length acoustic models result in considerable improvements in recognition performance, we carry out recognition experiments on both TIMIT and the Spoken Dutch Corpus and analyse the differences between the two sets of results. We establish that the details of the procedure used for initialising the longer-length models have a substantial effect on the speech recognition results. When initialised appropriately, longer-length acoustic models that borrow their topology from a sequence of triphones cannot capture the pronunciation variation phenomena that hinder recognition performance the most.
