Venue: 2012 IEEE Workshop on Spoken Language Technology

Exploiting loudness dynamics in stochastic models of turn-taking



Abstract

Stochastic turn-taking models have traditionally been implemented as N-grams, which condition predictions on recent binary-valued speech/non-speech contours. The current work re-implements this function using feed-forward neural networks, capable of accepting binary- as well as continuous-valued features; performance is shown to asymptotically approach that of the N-gram baseline as model complexity increases. The conditioning context is then extended to leverage loudness contours. Experiments indicate that the additional sensitivity to loudness considerably decreases average cross entropy rates on unseen data, by 0.03 bits per framing interval of 100 ms. This reduction is shown to make loudness-sensitive conversants capable of better predictions, with attention memory requirements at least 5 times smaller and responsiveness latency at least 10 times shorter than the loudness-insensitive baseline.
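
To make the N-gram baseline and the "bits per frame" metric concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single speaker's speech/non-speech sequence sampled once per 100 ms frame, add-one smoothing, and synthetic toy data; the function names `train_ngram`, `prob`, and `cross_entropy_bits` are hypothetical. It predicts each frame from the preceding frames and reports the average cross entropy on held-out frames.

```python
import math
from collections import defaultdict


def train_ngram(frames, order):
    """Count next-frame outcomes for each context of (order - 1) preceding frames."""
    counts = defaultdict(lambda: [0, 0])  # context -> [count of 0, count of 1]
    for t in range(order - 1, len(frames)):
        context = tuple(frames[t - order + 1:t])
        counts[context][frames[t]] += 1
    return counts


def prob(counts, context, value, alpha=1.0):
    """Smoothed P(value | context); additive smoothing is an assumption of this sketch."""
    c = counts.get(context, [0, 0])
    return (c[value] + alpha) / (c[0] + c[1] + 2 * alpha)


def cross_entropy_bits(counts, frames, order):
    """Average negative log2 probability per predicted frame (bits per frame)."""
    total, n = 0.0, 0
    for t in range(order - 1, len(frames)):
        context = tuple(frames[t - order + 1:t])
        total -= math.log2(prob(counts, context, frames[t]))
        n += 1
    return total / n


# Synthetic toy data (1 = speech, 0 = non-speech), one value per 100 ms frame.
train = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0] * 20
held_out = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0] * 5

# Order 1 ignores history; order 3 conditions on the two preceding frames.
h1 = cross_entropy_bits(train_ngram(train, 1), held_out, 1)
h3 = cross_entropy_bits(train_ngram(train, 3), held_out, 3)
print(f"order 1: {h1:.3f} bits/frame, order 3: {h3:.3f} bits/frame")
```

On this toy data the longer context yields a lower cross entropy, which is the kind of per-frame comparison the abstract reports between the loudness-sensitive and loudness-insensitive models.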
