Conference: International Conference on Intelligent Data Engineering and Automated Learning

Investigation of ART2-Based Audio-to-Visual Conversion for Multimedia Applications

Abstract

Audio-to-visual synchronization is important for multimedia applications that involve a talking human, whether natural or synthetic. The close correlation between the acoustic speech signal and visible lip movement can be exploited to develop real-time audio-to-visual conversion. In this article, we apply ART2 and a multi-audio-frame technique to derive a lip movement sequence from its corresponding audio speech stream. ART2 trains quickly and can learn new patterns without necessarily forgetting what it learned previously. For multi-user adaptation, we propose a system that uses one user's ART2 model as the reference model, together with audio adaptation and visual learning mechanisms for adapting to new users. The audio adaptation maps a new user's audio features onto the reference model's audio features, and the visual learning lets the reference ART2 model learn the new user's speech characteristics. Experimental results show that the proposed ART2-based method is both fast and effective in single-user and multi-user settings.
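
The abstract gives no implementation details, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes that "multi-audio-frame" means concatenating each audio frame with its neighbours, and it replaces full ART2 (which has additional normalization and feedback layers) with a much simpler ART-style learner: a vigilance test decides whether an input updates an existing category or creates a new one, which is how this family of networks can absorb new patterns without overwriting old ones. The names `stack_frames`, `ArtLikeClusterer`, `vigilance`, and `learning_rate`, as well as the use of a single list of lip parameters per category, are all hypothetical.

```python
import math


def stack_frames(frames, context=2):
    """Concatenate each frame with `context` neighbours on each side.

    Assumed reading of the "multi-audio-frame" idea; sequence edges are
    padded by repeating the first or last frame.
    """
    n = len(frames)
    stacked = []
    for t in range(n):
        window = []
        for k in range(t - context, t + context + 1):
            window.extend(frames[min(max(k, 0), n - 1)])
        stacked.append(window)
    return stacked


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


class ArtLikeClusterer:
    """Simplified ART-style learner (NOT full ART2).

    Each category stores a unit-length audio prototype and the running
    mean of the lip parameters seen with it.
    """

    def __init__(self, vigilance=0.9, learning_rate=0.5):
        self.vigilance = vigilance          # minimum cosine similarity to reuse a category
        self.learning_rate = learning_rate  # how far a prototype moves toward a new input
        self.prototypes = []
        self.lip_params = []
        self.counts = []

    def _best_match(self, x):
        best, best_sim = -1, -1.0
        for j, w in enumerate(self.prototypes):
            sim = sum(a * b for a, b in zip(x, w))
            if sim > best_sim:
                best, best_sim = j, sim
        return best, best_sim

    def learn(self, audio, lip):
        """Present one (audio vector, lip parameters) pair; return its category index."""
        x = normalize(audio)
        j, sim = self._best_match(x)
        if j >= 0 and sim >= self.vigilance:
            # Close enough to an existing category: refine only that category.
            lr = self.learning_rate
            w = self.prototypes[j]
            self.prototypes[j] = normalize(
                [(1 - lr) * wi + lr * xi for wi, xi in zip(w, x)]
            )
            self.counts[j] += 1
            c = self.counts[j]
            self.lip_params[j] = [m + (l - m) / c for m, l in zip(self.lip_params[j], lip)]
            return j
        # Vigilance test failed: create a new category, leaving old ones untouched.
        self.prototypes.append(x)
        self.lip_params.append(list(lip))
        self.counts.append(1)
        return len(self.prototypes) - 1

    def predict(self, audio):
        """Return the lip parameters of the best-matching category."""
        if not self.prototypes:
            raise ValueError("model has not learned any categories yet")
        j, _ = self._best_match(normalize(audio))
        return self.lip_params[j]
```

In use, one would call `stack_frames` on a sequence of per-frame audio feature vectors, feed each stacked vector with its matching lip parameters to `learn`, and then call `predict` on new stacked vectors to obtain a lip movement sequence. A higher `vigilance` produces more, finer-grained categories; a lower value merges more inputs into each category.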
