Conference: IEEE Winter Conference on Applications of Computer Vision

AlignNet: A Unifying Approach to Audio-Visual Alignment

Abstract

We present AlignNet, a model that synchronizes videos with reference audios under non-uniform and irregular misalignments. AlignNet learns the end-to-end dense correspondence between each frame of a video and an audio. Our method is designed according to simple and well-established principles: attention, pyramidal processing, warping, and affinity function. Together with the model, we release a dancing dataset, Dance50, for training and evaluation. Qualitative, quantitative and subjective evaluation results on dance-music alignment and speech-lip alignment demonstrate that our method far outperforms the state-of-the-art methods. Code, dataset and sample videos are available at our project page.
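
Two of the principles named in the abstract, the affinity function and warping, can be illustrated with a small sketch. This is not the authors' implementation: the function names `cosine_affinity` and `warp_sequence`, the choice of cosine similarity, and the linear interpolation are all assumptions made for illustration, and the attention and pyramidal components are left out entirely.

```python
import math


def cosine_affinity(video_feats, audio_feats):
    """Hypothetical affinity: A[i][j] is the cosine similarity between
    video frame i and audio frame j (each frame is a feature vector)."""
    def norm(v):
        # Fall back to 1.0 so an all-zero vector does not divide by zero.
        return math.sqrt(sum(x * x for x in v)) or 1.0

    return [
        [sum(a * b for a, b in zip(v, u)) / (norm(v) * norm(u))
         for u in audio_feats]
        for v in video_feats
    ]


def warp_sequence(frames, positions):
    """Hypothetical warp: resample `frames` at fractional source positions
    using linear interpolation, one output frame per entry in `positions`."""
    last = len(frames) - 1
    out = []
    for p in positions:
        p = min(max(p, 0.0), float(last))  # clamp to the valid range
        lo = int(math.floor(p))
        hi = min(lo + 1, last)
        w = p - lo
        out.append([(1 - w) * a + w * b
                    for a, b in zip(frames[lo], frames[hi])])
    return out


if __name__ == "__main__":
    affinity = cosine_affinity([[1.0, 0.0], [0.0, 1.0]],
                               [[1.0, 0.0], [1.0, 1.0]])
    print(affinity)
    # A non-uniform correspondence: output frame 1 falls halfway
    # between source frames 0 and 1.
    print(warp_sequence([[0.0], [1.0], [2.0], [3.0]], [0.0, 0.5, 2.0, 3.0]))
```

In this sketch the dense correspondence is simply given as a list of fractional positions; in a learned model such positions would be predicted rather than supplied by hand.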
