Journal of VLSI Signal Processing

Speech-to-Lip Movement Synthesis by Maximizing Audio-Visual Joint Probability Based on the EM Algorithm



Abstract

In this paper, we investigate a Hidden Markov Model (HMM)-based method to drive a lip movement sequence with input speech. In a previous study, we investigated a mapping method based on the Viterbi decoding algorithm, which converts an input speech signal to a lip movement sequence through the most likely state sequence of audio HMMs. However, this method can produce errors due to incorrectly decoded HMM states. This paper proposes a method to re-estimate the visual parameters by maximizing the audio-visual joint probability under HMMs, using the Expectation-Maximization (EM) algorithm. In the experiments, the proposed mapping method yields a 26% error reduction over the Viterbi-based algorithm on incorrectly decoded bilabial consonants.
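The contrast in the abstract, between the hard Viterbi state path and a soft, posterior-weighted estimate of the visual parameters (the E-step quantity used in EM re-estimation), can be sketched on a toy model. This is a minimal illustration, not the authors' system: the paper trains HMMs on real audio-visual speech data, whereas the 3-state transition matrix, initial distribution, and per-state lip-opening means below are invented numbers for demonstration.

```python
import numpy as np

# Toy audio HMM with 3 states; each state carries an illustrative
# 1-D visual parameter (lip-opening mean). All values are assumptions.
A = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])            # state transition probabilities
pi = np.array([1/3, 1/3, 1/3])             # initial state distribution
visual_means = np.array([0.0, 0.5, 1.0])   # visual parameter per state

def viterbi_visual(obs_lik):
    """Hard mapping: decode the most likely state path from per-frame
    audio observation likelihoods, then read off each state's visual mean."""
    T, N = obs_lik.shape
    delta = np.log(pi) + np.log(obs_lik[0])
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)    # scores[i, j]: come from i, go to j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(obs_lik[t])
    path = np.zeros(T, dtype=int)
    path[-1] = delta.argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return visual_means[path]

def posterior_visual(obs_lik):
    """Soft mapping: forward-backward state posteriors gamma(t, i)
    weight the per-state visual means, instead of committing to one
    (possibly wrong) decoded state per frame."""
    T, N = obs_lik.shape
    alpha = np.zeros((T, N))
    beta = np.zeros((T, N))
    alpha[0] = pi * obs_lik[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * obs_lik[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (obs_lik[t + 1] * beta[t + 1])
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)  # normalize posteriors per frame
    return gamma @ visual_means
```

For a frame whose audio likelihood is ambiguous between states, the Viterbi output snaps to a single state's visual mean, while the posterior-weighted output interpolates between the candidate states; this is the mechanism by which soft re-estimation can reduce errors at frames the hard decoder gets wrong.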
