Let's Play Music: Audio-Driven Performance Video Generation


Abstract

We propose a new task named Audio-driven Performance Video Generation (APVG), which aims to synthesize a video of a person playing a certain instrument, guided by a given music audio clip. Generating high-dimensional, temporally consistent video from the low-dimensional audio modality is challenging. In this paper, we propose a multi-stage framework to generate realistic and synchronized performance video from the given music. First, we generate coarse videos and keypoints of the body and hands from the given music, which provide global appearance and local spatial information, respectively. Then, we transform the generated keypoints into heatmaps via a differentiable space transformer, since heatmaps provide more spatial information but are harder to generate directly from audio. Finally, we propose a Structured Temporal UNet (STU) to extract both intra-frame structured information and inter-frame temporal consistency for final video generation. These are obtained via a graph-based structure module and a CNN-GRU based high-level temporal module, respectively. Comprehensive experiments validate the effectiveness of our proposed framework.
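The keypoint-to-heatmap step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe how the differentiable space transformer is built, so the code below swaps in a common alternative: rendering each keypoint as a Gaussian bump, whose pixel values are smooth functions of the keypoint coordinates. The function name `keypoints_to_heatmaps`, the `sigma` parameter, and the example coordinates are all hypothetical and not taken from the paper.

```python
import math


def keypoints_to_heatmaps(keypoints, height, width, sigma=1.5):
    """Render one Gaussian heatmap per (x, y) keypoint.

    Assumption: Gaussian rendering stands in for the paper's transformer.
    Each pixel value exp(-d^2 / (2 * sigma^2)) varies smoothly with the
    keypoint coordinates, which is what makes this mapping differentiable.
    """
    heatmaps = []
    for kx, ky in keypoints:
        heatmap = [
            [
                math.exp(-((x - kx) ** 2 + (y - ky) ** 2) / (2.0 * sigma ** 2))
                for x in range(width)
            ]
            for y in range(height)
        ]
        heatmaps.append(heatmap)
    return heatmaps


# Hypothetical example: two keypoints on an 8x8 grid (rows are y, columns are x).
maps = keypoints_to_heatmaps([(2.0, 3.0), (5.5, 1.25)], height=8, width=8)

# Locate the brightest pixel of the first heatmap as (row, column).
peak_row, peak_col = max(
    ((y, x) for y in range(8) for x in range(8)),
    key=lambda p: maps[0][p[0]][p[1]],
)
```

In this sketch the brightest pixel of each map sits at the pixel nearest the keypoint, and sub-pixel coordinates such as `(5.5, 1.25)` spread their weight over neighbouring pixels instead of snapping to the grid.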


Bibliographic details

Source
International Conference on Pattern Recognition | 2021 | pp. 3574-3581 | 8 pages
Authors
Hao Zhu; Yi Li; Feixia Zhu; Aihua Zheng; Ran He
Original format
PDF
Keywords
Instruments; Transforms; Synchronization; Data mining; Task analysis; Space heating


