
Maximising audiovisual correlation with automatic lip tracking and vowel based segmentation


Abstract

In recent years, the established link between the various human communication production domains has become more widely utilised in the field of speech processing. In this work, a state-of-the-art Semi Adaptive Appearance Model (SAAM) approach developed by the authors is used for automatic lip tracking, and an adapted version of our vowel-based speech segmentation system is employed to automatically segment speech. Canonical Correlation Analysis (CCA) on segmented and non-segmented data in a range of noisy speech environments finds that segmented speech has a significantly better audiovisual correlation, demonstrating the feasibility of our techniques for further development as part of a proposed audiovisual speech enhancement system.
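As a rough illustration of the kind of measurement CCA provides, the sketch below computes canonical correlations between two feature matrices using NumPy. It is not the authors' implementation: the function name `canonical_correlations`, the QR-plus-SVD formulation, and the synthetic "audio" and "visual" matrices are all assumptions made for this example, and the real features (for instance filterbank or lip-shape descriptors) are not specified here.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between X (n x p) and Y (n x q), rows as samples.

    Assumes both centred matrices have full column rank and n > max(p, q).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Centre each feature so that correlations are measured about the mean.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the two feature sets.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Hypothetical stand-ins for per-frame audio and visual feature vectors.
rng = np.random.default_rng(0)
n_frames = 500
audio = rng.standard_normal((n_frames, 4))
mixing = rng.standard_normal((2, 3))
visual_related = audio[:, :2] @ mixing + 0.1 * rng.standard_normal((n_frames, 3))
visual_unrelated = rng.standard_normal((n_frames, 3))

related_corr = canonical_correlations(audio, visual_related)
unrelated_corr = canonical_correlations(audio, visual_unrelated)
```

Comparing the leading value of `related_corr` with that of `unrelated_corr` mirrors, in miniature, the comparison the abstract describes between segmented and non-segmented speech: a higher leading canonical correlation indicates a stronger linear relationship between the two feature sets.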
