Journal: Journal of visual communication & image representation

Multimedia retrieval by deep hashing with multilevel similarity learning

Abstract

Deep multimodal hashing has received increasing research attention in recent years due to its superior performance in large-scale multimedia retrieval. However, limited efforts have been made to explore the complex multilevel semantic structure for deep multimodal hashing. In this paper, we propose a novel deep multimodal hashing method, termed Deep Hashing with Multilevel Similarity Learning (DHMSL), which learns compact and discriminative hash codes by exploring the multilevel semantic similarity correlations of multimedia data. In DHMSL, multilevel similarity correlation is explored to learn unified binary hash codes by exploiting the local structure and semantic label information simultaneously. Meanwhile, bit balance and quantization constraints are taken into account to make the unified hash codes more compact. With the unified binary codes learned, two deep neural networks are jointly trained to simultaneously learn feature representations and two sets of nonlinear hash functions. Specifically, well-designed loss functions are introduced to minimize the prediction errors of the feature representations as well as the errors between the unified binary codes and the outputs of the networks. Extensive experiments on two widely used multimodal datasets demonstrate that the proposed method achieves state-of-the-art performance on both image-query-text and text-query-image tasks. (C) 2019 Elsevier Inc. All rights reserved.
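
The abstract names bit balance and quantization constraints and retrieval over binary codes, but gives no formulas. The sketch below is therefore only a generic illustration of how such terms are commonly written in the hashing literature; it is not the DHMSL objective. It assumes codes take values in {-1, +1}, and the function names `quantization_loss`, `bit_balance_penalty`, and `hamming_distances` are hypothetical.

```python
import numpy as np


def quantization_loss(outputs, codes):
    """Mean squared gap between real-valued network outputs and target codes.

    Assumption: codes are in {-1, +1}; this is a generic surrogate, not the
    loss defined in the paper.
    """
    return float(np.mean((outputs - codes) ** 2))


def bit_balance_penalty(codes):
    """Sum over bits of the squared per-bit mean across all items.

    The penalty is 0 when every bit is +1 for exactly half of the items.
    """
    return float(np.sum(np.mean(codes, axis=0) ** 2))


def hamming_distances(query, database):
    """Hamming distance from one {-1, +1} query code to each database code.

    For +/-1 codes, dot(a, b) = n_bits - 2 * hamming(a, b).
    """
    n_bits = database.shape[1]
    return (n_bits - database @ query) // 2
```

In a cross-modal setting such as image-query-text, the query code would come from one modality's hash function and the database codes from the other's; ranking by ascending Hamming distance then yields the retrieval list.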
