IEEE International Conference on Acoustics, Speech and Signal Processing

Cross-modality matching based on Fisher Vector with neural word embeddings and deep image features


Abstract

Cross-modal retrieval, which addresses the problem that the query and the retrieved results come from different modalities, has become increasingly important with the development of the Internet. In this paper, we focus on exploring high-level semantic representations of images and text for cross-modal matching. Deep convolutional image features and Fisher Vectors computed over neural word embeddings are used as the visual and textual features, respectively. To further exploit the correlation among heterogeneous multimodal features, we use a multiclass logistic classifier for semantic matching across modalities. Experiments on the Wikipedia and Pascal Sentence datasets demonstrate the robustness and effectiveness of the approach on both Img2Text and Text2Img retrieval tasks.
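To make the pipeline in the abstract concrete, below is a minimal Python sketch of the overall idea: sentences are encoded as Fisher Vectors over word embeddings, images are represented by deep CNN activations, and a multiclass logistic classifier per modality maps both into a shared semantic (class-posterior) space used for retrieval. The GMM size, scikit-learn classifiers, random stand-in features, and cosine ranking are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of cross-modal matching with Fisher Vectors over word
# embeddings and CNN image features. All concrete settings here are
# assumptions for illustration only.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LogisticRegression


def fisher_vector(word_embeddings: np.ndarray, gmm: GaussianMixture) -> np.ndarray:
    """Encode a set of word embeddings (n_words x dim) as a Fisher Vector
    using gradients w.r.t. the GMM means and (diagonal) variances."""
    X = np.atleast_2d(word_embeddings)
    n, _ = X.shape
    q = gmm.predict_proba(X)                       # (n, K) posteriors
    means, covs, weights = gmm.means_, gmm.covariances_, gmm.weights_

    diff = (X[:, None, :] - means[None, :, :]) / np.sqrt(covs)[None, :, :]
    g_mu = (q[:, :, None] * diff).sum(axis=0) / (n * np.sqrt(weights)[:, None])
    g_sigma = (q[:, :, None] * (diff ** 2 - 1)).sum(axis=0) / (n * np.sqrt(2 * weights)[:, None])
    fv = np.hstack([g_mu.ravel(), g_sigma.ravel()])

    # Power and L2 normalization, standard for Fisher Vectors.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / (np.linalg.norm(fv) + 1e-12)


def semantic_space(features: np.ndarray, clf: LogisticRegression) -> np.ndarray:
    """Map modality-specific features to class-posterior vectors, which serve
    as the common semantic space for matching."""
    return clf.predict_proba(features)


# --- Toy usage with random stand-ins for real word2vec / CNN features -------
rng = np.random.default_rng(0)
n_classes, emb_dim, cnn_dim = 10, 300, 4096

# Fit a GMM on (stand-in) word embeddings; real usage would pool embeddings
# from the training corpus.
gmm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
gmm.fit(rng.normal(size=(2000, emb_dim)))

# Textual features: one Fisher Vector per document.
text_feats = np.stack([fisher_vector(rng.normal(size=(30, emb_dim)), gmm)
                       for _ in range(200)])
# Visual features: stand-ins for deep convolutional activations.
img_feats = rng.normal(size=(200, cnn_dim))
labels = rng.integers(0, n_classes, size=200)

# One multiclass logistic classifier per modality, sharing the label space.
text_clf = LogisticRegression(max_iter=1000).fit(text_feats, labels)
img_clf = LogisticRegression(max_iter=1000).fit(img_feats, labels)

# Text2Img retrieval: rank images by cosine similarity of semantic vectors.
t_sem = semantic_space(text_feats[:1], text_clf)   # query text
i_sem = semantic_space(img_feats, img_clf)         # image database
scores = (i_sem @ t_sem.T).ravel() / (
    np.linalg.norm(i_sem, axis=1) * np.linalg.norm(t_sem) + 1e-12
)
ranking = np.argsort(-scores)                      # best matches first
print("Top-5 retrieved image indices:", ranking[:5])
```

Img2Text retrieval works symmetrically: map an image query through `img_clf` and rank text Fisher Vectors mapped through `text_clf` by the same similarity.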
