Foreign-language conference: Annual Conference of the International Speech Communication Association

Automatic estimation of the first two subglottal resonances in children's speech with application to speaker normalization in limited-data conditions

Abstract

This paper proposes an automatic algorithm for estimating the first two subglottal resonances (SGRs), Sg1 and Sg2, from continuous speech of children, and applies it to automatic speaker normalization in mismatched, limited-data conditions. The proposed algorithm is based on the observation that Sg1 and Sg2 form phonological vowel feature boundaries, and is motivated by our recent SGR estimation algorithm for adults. The algorithm is trained on 25 children and evaluated on 9 children, aged between 7 and 18 years. The average RMS errors incurred in estimating Sg1 and Sg2 are 55 Hz and 144 Hz, respectively. Applying the proposed algorithm to a connected-digits speech recognition task shows that: 1) linear frequency warping using Sg1 or Sg2 is comparable to or better than maximum likelihood-based vocal tract length normalization (ML-VTLN), 2) the performance of SGR-based frequency warping is less content-dependent than that of ML-VTLN, and 3) SGR-based frequency warping can be integrated into ML-VTLN to yield a statistically significant improvement in performance.
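
The two quantities the abstract relies on, the RMS estimation error and a linear frequency warp driven by a subglottal resonance, can be illustrated with a minimal sketch. The abstract does not give the warping formula, so the following are assumptions made only for illustration: the warp factor is taken as the ratio of a reference Sg2 to the speaker's Sg2, warped frequencies are clipped at the Nyquist frequency, and the function names (rms_error, warp_factor, warp_frequencies) are hypothetical. All numbers are made up and are not results from the paper.

```python
import math


def rms_error(estimates_hz, references_hz):
    """Root-mean-square error between estimated and reference values (Hz)."""
    if not estimates_hz or len(estimates_hz) != len(references_hz):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimates_hz, references_hz)]
    return math.sqrt(sum(squared) / len(squared))


def warp_factor(speaker_sgr_hz, reference_sgr_hz):
    """Assumed linear warp factor: reference resonance over speaker resonance."""
    if speaker_sgr_hz <= 0 or reference_sgr_hz <= 0:
        raise ValueError("resonance frequencies must be positive")
    return reference_sgr_hz / speaker_sgr_hz


def warp_frequencies(freqs_hz, alpha, nyquist_hz):
    """Apply f' = alpha * f, clipping at the Nyquist frequency (an assumption)."""
    return [min(alpha * f, nyquist_hz) for f in freqs_hz]


# Illustrative values only: a speaker whose Sg2 estimate is 2000 Hz,
# warped toward a hypothetical reference Sg2 of 1500 Hz.
alpha = warp_factor(2000.0, 1500.0)
warped = warp_frequencies([1000.0, 4000.0], alpha, nyquist_hz=8000.0)
error = rms_error([610.0, 590.0], [600.0, 600.0])
print(alpha, warped, error)
```

With a single resonance estimate per speaker, the warp factor in this sketch is fixed directly rather than searched over a grid of candidates as in maximum likelihood-based VTLN, which is one plausible reading of why the SGR-based approach suits limited-data conditions.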