Pattern Recognition: The Journal of the Pattern Recognition Society

Actor-independent action search using spatiotemporal vocabulary with appearance hashing


Abstract

Human actions in movies and sitcoms often carry semantic cues for story understanding, offering a novel search pattern beyond the traditional video search scenario. However, action-level video search faces great challenges, such as global motion, concurrent actions, and actor appearance variance. In this paper, we introduce a generalized action retrieval framework that achieves fully unsupervised, robust, and actor-independent action search in large-scale databases. First, an Attention Shift model is presented to extract human-focused foreground actions from videos containing global motion or concurrent actions. A spatiotemporal vocabulary is then built from 3D-SIFT features extracted within these human-focused action regions; these features are robust to rotation and viewpoint changes. The spatiotemporal vocabulary guarantees search efficiency, achieved through an inverted indexing structure with approximate nearest-neighbor search. For online ranking, we employ the dynamic time warping distance to handle variance in action duration as well as partial action matching. Finally, an appearance hashing strategy is presented to address the performance degradation caused by divergent actor appearances. For experimental validation, we deployed the actor-independent action retrieval framework on three seasons of the "Friends" sitcom (over 30 h). On this database, we report the best performance (MAP@1 > 0.53) in comparison with alternative and state-of-the-art approaches.
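The online ranking step described above uses a dynamic time warping (DTW) distance so that two versions of the same action can match even when their durations differ. A minimal sketch of the classic DTW recurrence, using 1-D scalar sequences for brevity (the function name and the scalar distance are illustrative assumptions; the paper's features would be vocabulary-based descriptors with an appropriate per-frame distance):

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Classic O(len(a) * len(b)) DTW with a full accumulated-cost matrix.

    cost[i][j] holds the minimal accumulated cost of aligning the first
    i elements of `a` with the first j elements of `b`.
    """
    n, m = len(a), len(b)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] stretched
                                 cost[i][j - 1],      # b[j-1] stretched
                                 cost[i - 1][j - 1])  # one-to-one match
    return cost[n][m]
```

Because the warping path may stretch either sequence, a slow and a fast performance of the same action accumulate near-zero cost, e.g. `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is 0.0. Partial matching, as used in the paper, would additionally relax the boundary conditions so the query can align to a subsequence of the candidate.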
