IAPR Asian Conference on Pattern Recognition

Sequence-to-Sequence Learning for Human Pose Correction in Videos

Abstract

The power of ConvNets has been demonstrated in a wide variety of vision tasks, including pose estimation. However, they often produce grossly erroneous predictions in videos due to unusual poses, challenging illumination, blur, self-occlusion, and similar factors. These erroneous predictions can be refined by leveraging previous and future predictions under a temporal smoothness constraint on the video. In this paper, we present a generic approach for pose correction in videos using sequence learning that makes minimal assumptions about the sequence structure. The proposed model is generic, fast, and surpasses the state of the art on benchmark datasets. We use a generic pose estimator to obtain initial pose estimates, which are then refined by our method. The proposed architecture uses a Long Short-Term Memory (LSTM) encoder-decoder model to encode the temporal context and refine the estimates. We show a 3.7% gain over the baseline Yang & Ramanan (YR) and a 2.07% gain over the Spatial Fusion Network (SFN) on a new, challenging YouTube Pose Subset dataset.
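
The abstract does not give implementation details, so the following PyTorch sketch only illustrates the general idea it describes: an LSTM encoder-decoder that consumes a window of noisy per-frame pose estimates and emits refined ones. The joint count, layer sizes, residual output head, and MSE training loss are illustrative assumptions, not the authors' configuration.

import torch
import torch.nn as nn

class PoseCorrectionSeq2Seq(nn.Module):
    """Refines a sequence of noisy per-frame 2D pose estimates (sketch)."""
    def __init__(self, num_joints=14, hidden_size=256):
        super().__init__()
        in_dim = num_joints * 2                   # (x, y) per joint, flattened
        self.encoder = nn.LSTM(in_dim, hidden_size, batch_first=True)
        self.decoder = nn.LSTM(in_dim, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, in_dim)

    def forward(self, noisy_poses):
        # noisy_poses: (batch, frames, 2 * num_joints), initial estimates
        # from any off-the-shelf per-frame pose estimator.
        _, state = self.encoder(noisy_poses)      # summarize temporal context
        dec_out, _ = self.decoder(noisy_poses, state)
        # Predict residual corrections on top of the initial estimates,
        # so frames with reliable detections are left nearly unchanged.
        return noisy_poses + self.out(dec_out)

# Usage on dummy data: 8 clips of 30 frames, 14 joints each.
model = PoseCorrectionSeq2Seq()
noisy = torch.randn(8, 30, 28)
refined = model(noisy)                            # same shape as input
loss = nn.functional.mse_loss(refined, torch.randn_like(noisy))
loss.backward()

Predicting residual corrections rather than absolute poses is one common way to encode a temporal-smoothness prior; the paper's actual decoding scheme may differ.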
