Towards a high quality Finnish talking head



Abstract

We describe how our Finnish talking head was improved by using a new auditory speech synthesis method based on neural networks and optimal synchronization of the facial speech animation and the audio signal. In the first version of the talking head, the user typed in text, and synthesized auditory speech and synchronized facial animation were created automatically. That version combines a 3D facial model with a commercial auditory text-to-speech (TTS) synthesizer. The auditory speech is produced by concatenating pre-recorded samples of natural speech according to a set of rules. The quality of the current speech synthesis is not yet adequate. A new strategy has been developed to improve the TTS and to integrate auditory synthesizer synchronization, especially when hardware capabilities are limited. We are developing a new method to achieve optimal synchronization, independent of the platform used. This method is based on predictive visual synthesis. The new synchronization method gives us better control over audio-visual speech synthesis in the time domain. Because the diphone durations are known, we can use a more realistic interpolation function between the visemes and thus also take coarticulation effects into account.
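
The idea of interpolating between visemes over a known diphone duration can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the interpolation function, the facial parameterization, or the frame rate. Here a viseme is taken to be a plain list of facial parameters, the hypothetical `ease` function is a cosine ease-in/ease-out curve standing in for the "more realistic" function, and `interpolate_visemes` spreads the transition over the diphone duration so that the animation stays aligned with the audio.

```python
import math


def ease(t):
    """Cosine ease-in/ease-out weight in [0, 1].

    Assumed stand-in for a smoother-than-linear interpolation function;
    the abstract does not say which function the authors use.
    """
    t = min(max(t, 0.0), 1.0)  # clamp to the valid range
    return 0.5 - 0.5 * math.cos(math.pi * t)


def interpolate_visemes(start, end, duration_ms, frame_ms=40.0):
    """Return per-frame parameter vectors moving from `start` to `end`.

    `start` and `end` are hypothetical viseme parameter vectors of equal
    length; `duration_ms` is the diphone duration reported by the speech
    synthesizer; `frame_ms` is an assumed animation frame period (25 fps).
    """
    if len(start) != len(end):
        raise ValueError("viseme vectors must have the same length")
    if duration_ms <= 0:
        raise ValueError("duration must be positive")
    steps = max(1, round(duration_ms / frame_ms))
    frames = []
    for i in range(steps + 1):
        w = ease(i / steps)
        frames.append([a + w * (b - a) for a, b in zip(start, end)])
    return frames


if __name__ == "__main__":
    # Two made-up visemes: [jaw opening, lip rounding]
    for frame in interpolate_visemes([0.0, 1.0], [1.0, 0.0], 120.0):
        print([round(v, 3) for v in frame])
```

Because the number of frames is derived from the diphone duration rather than from a fixed transition length, a short diphone yields a fast transition and a long one a slow transition, which is one simple way to keep the visual track locked to the audio in time.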
