Journal: IEEE Transactions on Neural Networks and Learning Systems

Binary Set Embedding for Cross-Modal Retrieval



Abstract

Cross-modal retrieval is a challenging topic: traditional global representations fail to bridge the semantic gap between images and texts to a satisfactory level. Directly using local features from images and words from documents can be more robust in scenarios with large intraclass variations and small interclass discrepancies. In this paper, we propose a novel unsupervised binary coding algorithm, called binary set embedding (BSE), to obtain meaningful hash codes for local features from the image domain and words from the text domain. By understanding image features through word vectors learned from human language rather than through the documents provided with the data sets, BSE maps samples into a common Hamming space effectively and efficiently, where each sample is represented by a set of local feature descriptors from the image and text domains. In particular, BSE explores the relationships among local features at both the feature level and the image (text) level, which balance each other's sensitivity. Furthermore, a recursive orthogonalization procedure is applied to reduce the redundancy of the codes. Extensive experiments demonstrate the superior performance of BSE compared with state-of-the-art cross-modal hashing methods using either image or text queries.
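As a rough illustration of the general idea behind the abstract (not the authors' BSE algorithm itself), the following sketch shows generic unsupervised binary coding with an orthogonalized projection, which reduces redundancy between code bits, and retrieval by Hamming distance. All function names and the toy data here are hypothetical.

```python
import numpy as np

def orthogonal_projection(dim, n_bits, seed=0):
    # Random projection whose columns are orthonormalized via QR decomposition,
    # a standard stand-in for the kind of orthogonalization that decorrelates bits.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((dim, n_bits))
    Q, _ = np.linalg.qr(W)  # columns of Q are orthonormal
    return Q

def binarize(features, W):
    # Zero-center the features, project, and take the sign -> {0, 1} hash codes.
    proj = (features - features.mean(axis=0)) @ W
    return (proj > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    # Rank database items by Hamming distance (number of differing bits) to the query.
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists)

# Toy "local descriptors": 100 samples of dimension 64, hashed to 16 bits.
rng = np.random.default_rng(1)
img_feats = rng.standard_normal((100, 64))
W = orthogonal_projection(64, 16)
codes = binarize(img_feats, W)
ranking = hamming_rank(codes[0], codes)
```

In a cross-modal setting, image descriptors and word vectors would each be projected into the same Hamming space so that a text query can rank image codes (and vice versa); the sketch above covers only the single-modality mechanics.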
