Pattern Recognition: The Journal of the Pattern Recognition Society

Deep video code for efficient face video retrieval


Abstract

In this paper, we address a specific video retrieval problem: retrieving videos of human faces. Given a query in the form of either a single frame or a frame sequence of a person, we search the database and return the most relevant face videos, i.e., those that share the query's class label. This problem is very challenging because of large intra-class variations and because video representations must be efficient in both time and space. To handle these challenges, this paper proposes a novel Deep Video Code (DVC) method that encodes face videos into compact binary codes. Specifically, we devise an end-to-end convolutional neural network (CNN) framework that takes face videos as training inputs, models each of them as a unified representation through a temporal feature pooling operation, and finally projects the high-dimensional representations of both frames and videos into Hamming space to generate binary codes. In this Hamming space, the distance between dissimilar pairs is larger than that between similar pairs by a margin. To this end, we design a novel bounded triplet hashing loss that takes all dissimilar pairs into consideration for each anchor point in a mini-batch, which makes the optimization of the loss function smoother and more stable. Extensive experiments on challenging video face databases and general image/video datasets, with comparison to state-of-the-art methods, verify the effectiveness of our method in different kinds of retrieval scenarios. (c) 2020 Elsevier Ltd. All rights reserved.
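
The retrieval pipeline described in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the abstract does not give the pooling operator, the binarization rule, or the form of the bounded triplet hashing loss, so the mean pooling, the sign threshold at zero, and the plain hinge-style margin loss below are all assumptions chosen for illustration, as are every function name and the toy feature values. The sketch only shows how frame features might be pooled into one video-level code, how a frame query can be ranked against video codes by Hamming distance, and how a margin over all dissimilar pairs of an anchor could be expressed.

```python
from typing import Dict, List, Sequence


def temporal_mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame feature vectors into one video-level vector (assumed pooling)."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / n for d in range(dim)]


def binarize(features: Sequence[float]) -> List[int]:
    """Turn real-valued features into a binary code by thresholding at zero (assumed rule)."""
    return [1 if x > 0 else 0 for x in features]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the bit positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_videos(query_code: Sequence[int], database: Dict[str, List[int]]) -> List[str]:
    """Return database video names ordered by increasing Hamming distance to the query."""
    return sorted(database, key=lambda name: hamming(query_code, database[name]))


def margin_triplet_loss(anchor: Sequence[int], positive: Sequence[int],
                        negatives: Sequence[Sequence[int]], margin: int) -> float:
    """Generic hinge triplet loss averaged over all negatives of one anchor.

    This is a stand-in, not the paper's bounded triplet hashing loss.
    """
    d_pos = hamming(anchor, positive)
    terms = [max(0, d_pos - hamming(anchor, neg) + margin) for neg in negatives]
    return sum(terms) / len(terms)


# Toy 4-dimensional "CNN features" for two videos and one query frame (made-up values).
video_a = [[0.9, -0.2, 0.4, -0.7], [0.7, -0.4, 0.6, -0.5]]
video_b = [[-0.5, 0.8, -0.1, 0.3], [-0.7, 0.6, -0.3, 0.5]]
query_frame = [0.6, -0.1, 0.2, 0.3]

database = {
    "video_a": binarize(temporal_mean_pool(video_a)),
    "video_b": binarize(temporal_mean_pool(video_b)),
}
query_code = binarize(query_frame)
ranking = rank_videos(query_code, database)
loss = margin_triplet_loss(query_code, database["video_a"], [database["video_b"]], margin=3)
```

Because frames and videos are mapped into the same code space, a single frame and a whole sequence can be compared with the same Hamming distance, which is what makes both query forms mentioned in the abstract possible.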
