Journal: Computer Vision and Image Understanding

A support vector approach for cross-modal search of images and texts



Abstract

Building bilateral semantic associations between images and texts is among the fundamental problems in computer vision. In this paper, we study two complementary cross-modal prediction tasks: (i) predicting text(s) given a query image ("Im2Text"), and (ii) predicting image(s) given a piece of text ("Text2Im"). We make no assumption about the specific form of the text; i.e., it could be a set of labels, phrases, or even captions. We pose both tasks in a retrieval framework. For Im2Text, given a query image, our goal is to retrieve a ranked list of semantically relevant texts from an independent text corpus (i.e., texts with no corresponding images). Similarly, for Text2Im, given a query text, we aim to retrieve a ranked list of semantically relevant images from a collection of unannotated images (i.e., images without any associated textual meta-data). We propose a novel Structural SVM based unified framework for these two tasks, and show how it can be trained and tested efficiently. Using a variety of loss functions, extensive experiments are conducted on three popular datasets (two medium-scale datasets containing a few thousand samples each, and one web-scale dataset containing one million samples). The experiments demonstrate that our framework gives promising results compared to competing baseline cross-modal search techniques, confirming its efficacy.
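The retrieval formulation above scores every candidate text against a query image and returns a ranked list. As a minimal illustrative sketch (not the paper's method), the following assumes a simple bilinear compatibility score s(x, t) = xᵀWt between an image feature x and a text feature t, standing in for the Structural SVM joint feature map; all names and dimensions here are hypothetical:

```python
import numpy as np

def rank_texts(img_feat, text_feats, W):
    """Rank candidate texts for one query image.

    img_feat:   (d_img,) query image feature
    text_feats: (n, d_txt) matrix of candidate text features
    W:          (d_img, d_txt) learned compatibility matrix
    Returns indices of texts sorted from most to least relevant.
    """
    # s(x, t) = x^T W t, computed for all n candidates at once
    scores = text_feats @ (W.T @ img_feat)
    return np.argsort(-scores)

# Toy example with random features (hypothetical dimensions).
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))      # image-dim x text-dim
img = rng.standard_normal(4)         # one query image feature
texts = rng.standard_normal((5, 3))  # corpus of 5 candidate texts
order = rank_texts(img, texts, W)    # permutation of [0..4], best first
```

The symmetric Text2Im direction only swaps the roles of the two modalities: score a query text against a pool of unannotated image features with the same W and sort.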

