
Personalizing Human Video Pose Estimation



Abstract

We propose a personalized ConvNet pose estimator that automatically adapts itself to the uniqueness of a person's appearance to improve pose estimation in long videos. We make the following contributions: (i) we show that given a few high-precision pose annotations, e.g. from a generic ConvNet pose estimator, additional annotations can be generated throughout the video using a combination of image-based matching for temporally distant frames, and dense optical flow for temporally local frames, (ii) we develop an occlusion aware self-evaluation model that is able to automatically select the high-quality and reject the erroneous additional annotations, and (iii) we demonstrate that these high-quality annotations can be used to fine-tune a ConvNet pose estimator and thereby personalize it to lock on to key discriminative features of the person's appearance. The outcome is a substantial improvement in the pose estimates for the target video using the personalized ConvNet compared to the original generic ConvNet. Our method outperforms the state of the art (including top ConvNet methods) by a large margin on three standard benchmarks, as well as on a new challenging YouTube video dataset. Furthermore, we show that training from the automatically generated annotations can be used to improve the performance of a generic ConvNet on other benchmarks.
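The abstract outlines a self-training loop: propagate a few confident pose annotations through the video (dense optical flow for temporally nearby frames, image-based matching for distant ones), filter the propagated annotations with an occlusion-aware self-evaluation score, and fine-tune the generic ConvNet on the survivors. The sketch below illustrates only the local flow-propagation and filtering steps, in Python. It is not the authors' code: the `model.finetune` and `score_fn` interfaces are hypothetical placeholders, and the image-based matching for distant frames is omitted; only the OpenCV Farneback optical-flow call is a real API.

```python
# Minimal sketch of the personalization loop described in the abstract.
# Assumptions (not from the paper's released code): frames are BGR numpy
# arrays, keypoints are (x, y) pixel coordinates, `model` exposes a
# finetune() method, and `score_fn` stands in for the occlusion-aware
# self-evaluation model of contribution (ii).
import cv2
import numpy as np


def propagate_with_flow(src_frame, dst_frame, keypoints):
    """Contribution (i), local case: carry joint annotations to a
    temporally adjacent frame using dense (Farneback) optical flow."""
    flow = cv2.calcOpticalFlowFarneback(
        cv2.cvtColor(src_frame, cv2.COLOR_BGR2GRAY),
        cv2.cvtColor(dst_frame, cv2.COLOR_BGR2GRAY),
        None, 0.5, 3, 15, 3, 5, 1.2, 0)
    moved = []
    for x, y in keypoints:
        xi = int(np.clip(round(x), 0, flow.shape[1] - 1))
        yi = int(np.clip(round(y), 0, flow.shape[0] - 1))
        dx, dy = flow[yi, xi]
        moved.append((float(x) + float(dx), float(y) + float(dy)))
    return moved


def personalize(model, frames, seed_annotations, score_fn, threshold=0.5):
    """Spread a few high-precision annotations to neighbouring frames,
    keep only those the self-evaluation accepts (ii), then fine-tune
    the generic pose estimator on the accepted set (iii)."""
    annotations = dict(seed_annotations)          # frame index -> keypoints
    for idx in sorted(seed_annotations):
        for nxt in (idx - 1, idx + 1):            # temporally local frames only
            if 0 <= nxt < len(frames) and nxt not in annotations:
                candidate = propagate_with_flow(frames[idx], frames[nxt],
                                                annotations[idx])
                if score_fn(frames[nxt], candidate) >= threshold:
                    annotations[nxt] = candidate  # accepted by self-evaluation
    model.finetune(frames, annotations)           # personalize the ConvNet
    return model
```

In the full method, distant frames are annotated via image-based matching rather than chained flow, which limits drift; the sketch keeps only the local case for brevity.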