Journal: Neurocomputing

Image-text bidirectional learning network based cross-modal retrieval


Abstract

Cross-modal retrieval has attracted significant attention in the cross-media retrieval community. One key challenge is to eliminate the heterogeneity gap between different modalities. Many existing cross-modal retrieval approaches jointly construct a common subspace, but they do not sufficiently consider the mutual influence between modalities throughout the training process. In this paper, we propose a novel cross-modal retrieval method based on an image-text Bidirectional Learning Network (BLN). The method constructs a common representation space and directly measures the similarity of heterogeneous data. More specifically, a multi-layer supervision network is proposed to learn the cross-modal relevance of the generated representations. Moreover, a bidirectional crisscross loss function is proposed to preserve modal invariance in the common representation space through a bidirectional learning strategy. The discriminant consistency loss functions and the bidirectional crisscross loss are integrated into a single objective function that aims to minimize the intra-class distance and maximize the inter-class distance. Comprehensive experimental results on four widely used databases show that the proposed method is effective and outperforms existing cross-modal retrieval methods. (c) 2022 Elsevier B.V. All rights reserved.
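
As a rough illustration of what "directly measuring the similarity of heterogeneous data" in a common representation space can look like at retrieval time, the sketch below ranks texts for an image query and images for a text query. It is not the BLN model: the embeddings are hypothetical values assumed to be already projected into the common space, and cosine similarity is an assumed metric, since the abstract does not name one.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank(query, gallery):
    """Return gallery indices ordered from most to least similar to the query."""
    scores = [cosine(query, item) for item in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


# Hypothetical embeddings, assumed to already live in the common space.
image_embeddings = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.2]]
text_embeddings = [[0.2, 0.9, 0.1], [1.0, 0.0, 0.1]]

# Retrieval in both directions uses the same similarity measure.
image_to_text = [rank(img, text_embeddings) for img in image_embeddings]
text_to_image = [rank(txt, image_embeddings) for txt in text_embeddings]
```

Because both modalities share one space, the same `rank` helper (a name introduced here for illustration) serves image-to-text and text-to-image queries without any modality-specific logic.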
