Collective Reconstructive Embeddings for Cross-Modal Hashing

Abstract

In this paper, we study cross-modal retrieval using hashing-based approximate nearest neighbor search. Most existing cross-modal hashing methods address the complexity of multi-modal integration by applying the same mapping and similarity calculation to data from different media types. However, this can cause information loss during mapping, because it overlooks the specifics of each individual modality. We propose a simple yet effective cross-modal hashing approach, termed collective reconstructive embeddings (CRE), which simultaneously addresses the heterogeneity and the integration complexity of multi-modal data. To address the heterogeneity challenge, we process each type of data with its own modality-specific model. Specifically, we model textual data with a cosine similarity-based reconstructive embedding to alleviate data sparsity as far as possible, while for image data we use the Euclidean distance to characterize the relationships among the projected hash codes. Meanwhile, we unify the projections of text and image into the Hamming space as a common reconstructive embedding through rigorous mathematical reformulation, which not only reduces the optimization complexity significantly but also facilitates inter-modal similarity preservation. We further incorporate the code balance and uncorrelation criteria into the problem and devise an efficient iterative algorithm for optimization. Comprehensive experiments on four widely used multimodal benchmarks show that the proposed CRE achieves superior performance compared with the state of the art on several challenging cross-modal tasks.
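The abstract mentions Hamming-space codes and the code balance and uncorrelation criteria without giving formulas. The following is a minimal sketch of what those terms commonly mean in the hashing literature, not the CRE algorithm itself. It assumes codes stored as a NumPy array with entries in {-1, +1}; the function names are hypothetical and chosen only for illustration.

```python
import numpy as np


def hamming_distances(query, database):
    """Hamming distance from one code to every row of `database`.

    Assumes entries in {-1, +1}. For such codes of length r,
    distance = (r - <query, row>) / 2.
    """
    r = database.shape[1]
    return (r - database @ query) // 2


def balance_violation(codes):
    """Largest absolute column sum.

    Code balance (common formulation, assumed here): each bit is +1 for
    half of the items, so every column sum should be 0.
    """
    return np.abs(codes.sum(axis=0)).max()


def uncorrelation_violation(codes):
    """Largest absolute deviation of (1/n) * B^T B from the identity.

    Uncorrelation (common formulation, assumed here): different bits
    are uncorrelated, so (1/n) * B^T B should equal I.
    """
    n, r = codes.shape
    gram = codes.T @ codes / n
    return np.abs(gram - np.eye(r)).max()


# Toy example: all four 2-bit codes, which are balanced and uncorrelated.
codes = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
query = np.array([1, 1])  # e.g. the code of a text query

ranking = np.argsort(hamming_distances(query, codes), kind="stable")
print(ranking)                         # database items, nearest first
print(balance_violation(codes))        # 0 when every bit is balanced
print(uncorrelation_violation(codes))  # 0.0 when bits are uncorrelated
```

In a cross-modal setting, the query code would come from one modality (say, text) and the database codes from another (say, images); once both live in the same Hamming space, retrieval reduces to the ranking step shown above.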

Bibliographic record

  • Source
    IEEE Transactions on Image Processing | 2019, No. 6 | pp. 2770-2784 | 15 pages
  • Author affiliations

    Univ Elect Sci & Technol China, Ctr Future Media, Chengdu 611731, Sichuan, Peoples R China|Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China;

    Univ Elect Sci & Technol China, Ctr Future Media, Chengdu 611731, Sichuan, Peoples R China|Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China;

    Univ Elect Sci & Technol China, Ctr Future Media, Chengdu 611731, Sichuan, Peoples R China|Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China;

    Univ Elect Sci & Technol China, Ctr Future Media, Chengdu 611731, Sichuan, Peoples R China|Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China;

    Hefei Univ Technol, Sch Comp & Informat, Hefei 230009, Anhui, Peoples R China;

    Univ Elect Sci & Technol China, Ctr Future Media, Chengdu 611731, Sichuan, Peoples R China|Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China;

  • Indexing information
  • Original format: PDF
  • Language: English (eng)
  • Chinese Library Classification
  • Keywords

    Cross-modal hashing; reconstructive embeddings; cross-modal retrieval;
