Computational Intelligence and Neuroscience

Pose Estimation-Assisted Dance Tracking System Based on Convolutional Neural Network



Abstract

In the field of music-driven, computer-assisted dance movement generation, traditional music-movement adaptation and statistical mapping models suffer from the following problems: first, the dance sequences they generate do not fit the music closely enough; second, the generated dance movements lack completeness; third, the smoothness and plausibility of long dance sequences need improvement; fourth, traditional models cannot produce novel dance movements. How to generate smooth and complete dance pose sequences from music is the problem investigated in this paper. To address these issues, we design a deep learning dance generation algorithm that extracts the association between audio and movement features. In the feature extraction phase, rhythm features and beat features extracted from the music audio serve as the musical features, while the coordinates of the key points of the human skeleton extracted from dance videos serve as the movement features for training. In the model construction phase, the generator module learns a basic mapping from music to dance movements and produces smooth dance poses; the discriminator module enforces consistency between the dance and the music; and the autoencoder module makes the audio features more representative. Experimental results on the DeepFashion dataset show that the model can synthesize novel views of a target person in any given pose, transform the same person between different poses, and preserve the target person's external appearance and clothing textures. A whole-to-detail generation strategy further improves the final video synthesis.
To address the problem of incoherent character movements in the synthesized video, we propose to optimize character motion with a generative adversarial network, specifically by inserting generated motion compensation frames into incoherent movement sequences to improve the smoothness of the synthesized video.
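The frame-insertion idea above can be illustrated with a minimal sketch. In the paper the compensation frames are produced by a generative adversarial network; here, purely as a stand-in, linearly interpolated skeleton keypoint frames are inserted between consecutive poses. The `(num_frames, num_joints, 2)` layout, the 3-joint toy skeleton, and the function name are illustrative assumptions, not the paper's exact representation.

```python
import numpy as np

def insert_compensation_frames(poses, n_inserted=1):
    """Insert n_inserted interpolated poses between each consecutive pair
    of keypoint frames to smooth an incoherent movement sequence.

    poses: array of shape (num_frames, num_joints, 2) with 2D joint coords.
    Returns an array with n_inserted extra frames between every pair.
    """
    smoothed = [poses[0]]
    for prev, nxt in zip(poses[:-1], poses[1:]):
        for k in range(1, n_inserted + 1):
            t = k / (n_inserted + 1)
            # Linear blend stands in for a GAN-generated compensation frame.
            smoothed.append((1 - t) * prev + t * nxt)
        smoothed.append(nxt)
    return np.stack(smoothed)

# Two frames of a toy 3-joint skeleton; one compensation frame is inserted.
seq = np.array([[[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]],
                [[2.0, 0.0], [3.0, 0.0], [3.0, 2.0]]])
out = insert_compensation_frames(seq, n_inserted=1)
print(out.shape)  # (3, 3, 2): the midpoint frame sits between the originals
```

In the full system, the interpolation step would be replaced by the generator producing the in-between pose conditioned on the surrounding frames, but the insertion logic around it is the same.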
