
Deep canonical correlation analysis with progressive and hypergraph learning for cross-modal retrieval



Abstract

This paper addresses the problem of modeling Internet images and their associated texts for cross-modal retrieval, such as text-to-image and image-to-text retrieval. We start with deep canonical correlation analysis (DCCA), a deep approach for mapping text and image pairs into a common latent space. We first propose a novel progressive framework and embed DCCA in it. In this framework, a linear projection loss layer is inserted before the nonlinear hidden layers of a deep network. The training of the linear projection is combined with the training of the nonlinear layers, so that the linear projection is well matched to the nonlinear processing stages and the network output is a good representation of the raw input data. We then introduce a hypergraph semantic embedding (HSE) method, which extracts latent semantics from texts, into DCCA to regularize the latent space learned from the image view and the text view. In addition, a search-based similarity measure is proposed to score the relevance of image-text pairs. Based on these ideas, we propose a model, called DCCA-PHS, for cross-modal retrieval. Experiments on three publicly available data sets show that DCCA-PHS is effective and efficient and achieves state-of-the-art performance in the unsupervised scenario. (C) 2016 Elsevier B.V. All rights reserved.
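
As background for the phrase "mapping text and image pairs into a common latent space", the following is a minimal sketch of plain linear CCA followed by cosine-similarity ranking. It is not the DCCA-PHS model: there is no deep network, no progressive linear projection layer, and no HSE regularizer, and cosine similarity is only a stand-in for the search-based measure, which the abstract does not define. The function names (`fit_linear_cca`, `rank_by_cosine`, `demo`), the regularization value, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def _inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def fit_linear_cca(x, y, dim, reg=1e-3):
    """Fit linear CCA on paired rows of x (n, dx) and y (n, dy).

    Returns the two means, the two projection matrices, and the top
    `dim` canonical correlations. `reg` is an assumed ridge term that
    keeps the covariance matrices invertible.
    """
    mean_x, mean_y = x.mean(axis=0), y.mean(axis=0)
    xc, yc = x - mean_x, y - mean_y
    n = x.shape[0]
    cov_xx = xc.T @ xc / (n - 1) + reg * np.eye(x.shape[1])
    cov_yy = yc.T @ yc / (n - 1) + reg * np.eye(y.shape[1])
    cov_xy = xc.T @ yc / (n - 1)
    # Whiten each view, then take the SVD of the whitened cross-covariance.
    wx, wy = _inv_sqrt(cov_xx), _inv_sqrt(cov_yy)
    u, s, vt = np.linalg.svd(wx @ cov_xy @ wy)
    proj_x = wx @ u[:, :dim]
    proj_y = wy @ vt.T[:, :dim]
    return mean_x, mean_y, proj_x, proj_y, s[:dim]


def rank_by_cosine(query, gallery):
    """Return gallery row indices sorted from most to least similar."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-(g @ q))


def demo():
    """Synthetic 'image' and 'text' features sharing a 4-D latent cause."""
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(200, 4))
    images = latent @ rng.normal(size=(4, 10)) + 0.05 * rng.normal(size=(200, 10))
    texts = latent @ rng.normal(size=(4, 8)) + 0.05 * rng.normal(size=(200, 8))
    mean_x, mean_y, proj_x, proj_y, corrs = fit_linear_cca(images, texts, dim=4)
    img_emb = (images - mean_x) @ proj_x
    txt_emb = (texts - mean_y) @ proj_y
    # Text-to-image retrieval: rank all images for the first text.
    ranking = rank_by_cosine(txt_emb[0], img_emb)
    return corrs, ranking
```

Calling `demo()` returns the canonical correlations of the synthetic pairs and a ranking of all 200 image embeddings for the first text query. Swapping the roles of `txt_emb` and `img_emb` in `rank_by_cosine` gives the image-to-text direction.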

Bibliographic details

  • Source
    Neurocomputing | 2016, Issue 19 | pp. 618-628 | 11 pages
  • Author affiliations

    Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing, Peoples R China;

    Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing, Peoples R China|China Univ Petr, Coll Comp & Commun Engn, Qingdao, Peoples R China;

    Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing, Peoples R China|Beijing Key Lab Network Syst & Network Culture, Beijing, Peoples R China;

    Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing, Peoples R China|Beijing Key Lab Network Syst & Network Culture, Beijing, Peoples R China;

    Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing, Peoples R China;

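  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords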

    Progressive; Semantic; Hypergraph; Search-based;

