Deep video code for efficient face video retrieval

Qiao Shishi; Wang Ruiping; Shan Shiguang; Chen Xilin

Abstract

In this paper, we address a specific video retrieval problem concerning human faces. Given a query in the form of either a single frame or a sequence of frames of a person, we search the database and return the most relevant face videos, i.e., those that have the same class label as the query. This problem is very challenging because of the large intra-class variations and the high demands on the efficiency of video representations in terms of both time and space. To handle these challenges, this paper proposes a novel Deep Video Code (DVC) method that encodes face videos into compact binary codes. Specifically, we devise an end-to-end convolutional neural network (CNN) framework that takes face videos as training inputs, models each of them as a unified representation through a temporal feature pooling operation, and finally projects the high-dimensional representations of both frames and videos into Hamming space to generate binary codes. In this Hamming space, the distance between dissimilar pairs is larger than that between similar pairs by a margin. To this end, we design a novel bounded triplet hashing loss that takes all dissimilar pairs into consideration for each anchor point in a mini-batch and whose optimization is smoother and more stable. Extensive experiments on challenging video face databases and general image/video datasets, with comparisons against state-of-the-art methods, verify the effectiveness of our method in different kinds of retrieval scenarios. (c) 2020 Elsevier Ltd. All rights reserved.
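
The pipeline described in the abstract (pool frame features over time, project to Hamming space, rank by Hamming distance, train with a margin between similar and dissimilar pairs) can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation: random vectors stand in for CNN features, a random matrix stands in for the learned hashing layer, mean pooling is assumed as the temporal pooling operation, and the loss is a plain hinge triplet loss over all dissimilar pairs per anchor. The abstract does not give the formula of the bounded triplet hashing loss, so the bounding itself is not reproduced. All function names are hypothetical.

```python
import numpy as np


def temporal_pool(frame_features: np.ndarray) -> np.ndarray:
    """Average per-frame features (T x D) into one video-level vector (D,).

    Mean pooling is an assumption; the abstract only says "temporal feature pooling".
    """
    return frame_features.mean(axis=0)


def to_binary_code(features: np.ndarray, projection: np.ndarray) -> np.ndarray:
    """Project features (D,) to K dimensions and threshold at zero to get K bits."""
    return (features @ projection > 0).astype(np.uint8)


def hamming_distance(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Count differing bits between one code (K,) and each database code (N x K)."""
    return np.count_nonzero(db_codes != query_code, axis=1)


def margin_triplet_loss(dist: np.ndarray, labels: np.ndarray, margin: float) -> float:
    """Plain hinge triplet loss over every (anchor, similar, dissimilar) triplet.

    dist is a B x B matrix of pairwise distances and labels has shape (B,).
    This is a stand-in for the paper's bounded loss, not its actual formula.
    """
    same = labels[:, None] == labels[None, :]
    pos = same & ~np.eye(len(labels), dtype=bool)  # similar pairs, excluding self
    neg = ~same                                    # all dissimilar pairs
    # violation[a, p, n] = d(a, p) + margin - d(a, n)
    violation = dist[:, :, None] + margin - dist[:, None, :]
    valid = pos[:, :, None] & neg[:, None, :]
    losses = np.maximum(violation, 0.0)[valid]
    return float(losses.mean()) if losses.size else 0.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    projection = rng.standard_normal((128, 48))  # stand-in for a learned hashing layer
    # Three database videos with different numbers of frames, 128-d features each.
    videos = [rng.standard_normal((t, 128)) for t in (10, 25, 7)]
    db_codes = np.stack([to_binary_code(temporal_pool(v), projection) for v in videos])
    # A single frame as the query; a frame sequence would be pooled first.
    query_code = to_binary_code(videos[1][0], projection)
    ranking = np.argsort(hamming_distance(query_code, db_codes))
    print("database videos ranked by Hamming distance:", ranking)
```

Because frames and pooled videos pass through the same projection, a single-frame query and a video query yield codes of the same length and can be compared against the same database, which matches the two query forms mentioned in the abstract.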

Bibliographic details

Source
Pattern Recognition: The Journal of the Pattern Recognition Society | 2021, Issue 1 | 11 pages
Authors
Qiao Shishi; Wang Ruiping; Shan Shiguang; Chen Xilin
Affiliation (all four authors)
Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
Original format: PDF
Language: English
CLC classification: Computing technology, computer technology
Keywords
Face video retrieval; Temporal feature pooling; Bounded triplet loss; Deep video code; Hash learning

Similar literature

1. Efficient Coding Unit and Prediction Unit Decision Algorithm for Multiview Video Coding [J]. Wei-Hsiang Chang, Mei-Juan Chen, Gwo-Long Li. 电子科技学刊：英文版, 2015, Issue 2
2. Efficient fast mode decision using mode complexity for multi-view video coding [J]. WANG Feng-sui, SHEN Qing-hong, DU Si-dan. 中南大学学报（英文版）, 2014, Issue 11
3. Unsupervised Deep Video Hashing via Balanced Code for Large-Scale Video Retrieval [J]. Gengshen Wu, Jungong Han, Yuchen Guo. IEEE Transactions on Image Processing, 2019, Issue 4
4. Efficient embedding and retrieval of information for high-resolution videos coded with HEVC [J]. Computers and Electrical Engineering, 2020 (issue not given)
5. Hierarchical video indexing and retrieval for subband-coded video [J]. Lee J., Dickinson B.W. IEEE Transactions on Circuits and Systems for Video Technology, 2000, Issue 5
6. Deep Video Code for Efficient Face Video Retrieval [C]. Shishi Qiao, Ruiping Wang, Shiguang Shan. Asian conference on computer vision, 2017
7. Efficient multi-view video coding scheme based on dynamic video object segmentation [D]. Wei, Xiaohui. 2007
8. The Simple Video Coder: A free tool for efficiently coding social video data [O]. Daniel Barto, Clark W. Bird, Derek A. Hamilton
9. Unsupervised Deep Video Hashing via Balanced Code for Large-Scale Video Retrieval [O]. Gengshen Wu, Jungong Han, Yuchen Guo. 2019
10. Efficient Coding of Video Images [R]. Staelin, D. H. 1984
