Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

An Attention Model for Hypernasality Prediction in Children with Cleft Palate

Abstract

Hypernasality refers to the perception of abnormal nasal resonance in vowels and voiced consonants. Estimating hypernasality severity from connected speech samples involves learning a mapping between frame-level features and utterance-level clinical ratings of hypernasality. However, not all speech frames contribute equally to the perception of hypernasality. In this work, we propose an attention-based bidirectional long short-term memory (BLSTM) model that maps frame-level features directly to utterance-level ratings by focusing only on the specific speech frames that carry hypernasal cues. The model's performance is evaluated on the Americleft database, which contains speech samples from children with cleft palate together with clinical ratings of hypernasality. We analyzed the attention weights over broad phonetic categories and found that the model yields results consistent with what is known in the speech science literature. Further, the correlation between the predicted and perceptual ratings is significant (r = 0.684, p < 0.001) and higher than that of conventional BLSTMs trained using frame-wise and last-frame approaches.
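The idea of weighting frames unequally before producing a single utterance-level rating can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the scoring function, so additive (tanh) attention is assumed here, and the parameter names `w`, `v`, `u`, `b`, the dimensions, and the random stand-in for the BLSTM outputs are all hypothetical. The last-frame baseline is shown only for contrast.

```python
import numpy as np

def attention_pool(hidden, w, v):
    """Pool frame-level states (T, D) into one utterance vector (D,).

    Assumed additive attention: score_t = v . tanh(hidden_t @ w).
    """
    scores = np.tanh(hidden @ w) @ v           # one scalar score per frame, shape (T,)
    scores = scores - scores.max()             # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over frames
    context = weights @ hidden                 # weighted sum of frames, shape (D,)
    return context, weights

def predict_rating(hidden, w, v, u, b):
    """Map the attention-pooled vector to a single utterance-level rating."""
    context, weights = attention_pool(hidden, w, v)
    return float(context @ u + b), weights

# Hypothetical sizes: T frames, D hidden units, A attention units.
rng = np.random.default_rng(0)
T, D, A = 50, 8, 4
hidden = rng.standard_normal((T, D))   # stand-in for BLSTM outputs (assumption)
w = rng.standard_normal((D, A))        # untrained, illustrative parameters
v = rng.standard_normal(A)
u = rng.standard_normal(D)
b = 0.0

rating, weights = predict_rating(hidden, w, v, u, b)

# Last-frame baseline for contrast: only the final state reaches the output layer.
last_frame_rating = float(hidden[-1] @ u + b)
```

In a trained model the `weights` vector is what would be inspected per phonetic category: frames with larger weights contribute more to the predicted rating, whereas the last-frame baseline ignores all earlier frames at the output layer.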
