Computers & Graphics

Adversarial gesture generation with realistic gesture phasing


Abstract

Conversational virtual agents are increasingly common and popular, but modeling their non-verbal behavior is a complex problem that remains unsolved. Gesture is a key component of speech-accompanying behavior but is difficult to model due to its non-deterministic and variable nature. We explore the use of a generative adversarial training paradigm to map speech to 3D gesture motion. We define the gesture generation problem as a series of smaller sub-problems, including plausible gesture dynamics, realistic joint configurations, and diverse and smooth motion. Each sub-problem is monitored by separate adversaries. For the problem of enforcing realistic gesture dynamics in our output, we train three classifiers with different levels of detail to automatically detect gesture phases. We hand-annotate and evaluate over 3.8 hours of gesture data for this purpose, including samples of a second speaker for comparing and validating our results. We find adversarial training to be superior to the use of a standard regression loss and discuss the benefit of each of our training objectives. We recorded a dataset of over 6 hours of natural, unrehearsed speech with high-quality motion capture, as well as audio and video recording. (C) 2020 Elsevier Ltd. All rights reserved.
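The abstract describes a multi-adversary setup in which each sub-problem (realistic joint configurations, plausible dynamics, smooth and diverse motion) is monitored by its own discriminator. The sketch below is a hypothetical illustration of that idea, not the authors' implementation: it assumes generic speech-feature and pose dimensions, a GRU generator, and two PyTorch discriminators, one applied to poses and one to frame-to-frame velocities. All names, sizes, and the toy data are assumptions.

```python
# Hypothetical sketch only: a speech-to-gesture generator trained against two
# discriminators, one per sub-problem, loosely following the multi-adversary idea
# in the abstract. Feature dimensions, architectures, and data are all assumptions.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a window of speech features to a window of 3D pose parameters."""
    def __init__(self, speech_dim=26, pose_dim=45, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(speech_dim, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, pose_dim)

    def forward(self, speech):                 # (batch, time, speech_dim)
        h, _ = self.rnn(speech)
        return self.head(h)                    # (batch, time, pose_dim)

class Discriminator(nn.Module):
    """Scores a motion window as real or generated; instantiated once per sub-problem."""
    def __init__(self, in_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_dim, hidden, kernel_size=5, padding=2), nn.LeakyReLU(0.2),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(hidden, 1),
        )

    def forward(self, motion):                 # (batch, time, in_dim)
        return self.net(motion.transpose(1, 2))

def velocities(poses):
    """Frame-to-frame differences as a crude stand-in for gesture dynamics."""
    return poses[:, 1:] - poses[:, :-1]

speech_dim, pose_dim = 26, 45                  # assumed feature sizes
G = Generator(speech_dim, pose_dim)
D_pose = Discriminator(pose_dim)               # adversary for realistic joint configurations
D_dyn = Discriminator(pose_dim)                # adversary for plausible gesture dynamics

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(list(D_pose.parameters()) + list(D_dyn.parameters()), lr=1e-4)

speech = torch.randn(8, 64, speech_dim)        # toy stand-in for speech features
real = torch.randn(8, 64, pose_dim)            # toy stand-in for matching mocap poses

for step in range(3):                          # toy loop; a real run trains for many epochs
    # Discriminator step: each adversary separates real from generated motion.
    fake = G(speech).detach()
    d_loss = 0.0
    for D, x_real, x_fake in [(D_pose, real, fake),
                              (D_dyn, velocities(real), velocities(fake))]:
        r, f = D(x_real), D(x_fake)
        d_loss = d_loss + bce(r, torch.ones_like(r)) + bce(f, torch.zeros_like(f))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: try to fool every adversary at once.
    fake = G(speech)
    g_loss = 0.0
    for D, x_fake in [(D_pose, fake), (D_dyn, velocities(fake))]:
        f = D(x_fake)
        g_loss = g_loss + bce(f, torch.ones_like(f))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

A plain regression term toward the recorded poses could be added to the generator objective for comparison; the abstract reports that the adversarial objectives outperform such a standard regression loss.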
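The abstract also mentions training classifiers at three levels of detail to detect gesture phases automatically. The following is a minimal, hypothetical frame-level phase classifier, assuming the conventional phase inventory (rest, preparation, stroke, hold, retraction) and per-frame labels; the paper's actual label granularities and architecture are not given here, so this is a sketch under those assumptions.

```python
# Hypothetical sketch: per-frame gesture-phase classification over a motion window.
# The phase set and all dimensions are assumptions, not the paper's specification.
import torch
import torch.nn as nn

PHASES = ["rest", "preparation", "stroke", "hold", "retraction"]

class PhaseClassifier(nn.Module):
    def __init__(self, pose_dim=45, hidden=128, n_phases=len(PHASES)):
        super().__init__()
        self.rnn = nn.GRU(pose_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_phases)

    def forward(self, poses):                  # (batch, time, pose_dim)
        h, _ = self.rnn(poses)
        return self.head(h)                    # per-frame logits: (batch, time, n_phases)

clf = PhaseClassifier()
poses = torch.randn(4, 64, 45)                 # toy stand-in for mocap windows
labels = torch.randint(0, len(PHASES), (4, 64))  # toy stand-in for hand-annotated phases
loss = nn.CrossEntropyLoss()(clf(poses).reshape(-1, len(PHASES)), labels.reshape(-1))
loss.backward()
```

Such a classifier could be trained on the hand-annotated phase labels and then used as an auxiliary signal when judging whether generated motion exhibits realistic gesture phasing.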
