Image and Sentence Matching via Semantic Concepts and Order Learning

Abstract

Image and sentence matching has made great progress recently, but it remains challenging due to the large visual-semantic discrepancy. This mainly arises from two aspects: 1) images consist of unstructured content which is not as semantically abstract as the words in the sentences, so the two are not directly comparable, and 2) arranging semantic concepts in a different semantic order can lead to quite different meanings. The words in the sentences are sequentially arranged in a grammatical manner, while the semantic concepts in the images are usually unorganized. In this work, we propose a semantic concepts and order learning framework for image and sentence matching, which can improve the image representation by first predicting semantic concepts and then organizing them in a correct semantic order. Given an image, we first use a multi-regional multi-label CNN to predict the semantic concepts it contains in terms of object, property and action. These word-level semantic concepts are directly comparable with the nouns, adjectives and verbs in the matched sentence. Then, to organize these concepts and make them express a meaning similar to that of the matched sentence, we use a context-modulated attentional LSTM to learn the semantic order. It regards the predicted semantic concepts and the global scene of the image as context at each timestep, and selectively attends to concept-related image regions by referring to the context in a sequential order. To further enhance the semantic order, we perform additional sentence generation on the image representation, using the ground-truth order in the matched sentence as supervision. After obtaining the improved image representation, we learn the sentence representation with a conventional LSTM, and then jointly perform image and sentence matching and sentence generation for model learning. Extensive experiments demonstrate the effectiveness of our learned semantic concepts and order, achieving state-of-the-art results on two public benchmark datasets.
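
The abstract does not give the equations behind the context-modulated attention, so the following is only a minimal sketch of one generic soft-attention step, not the authors' formulation. It assumes that the context vector (standing in for the predicted concepts and global scene) is simply added to the LSTM hidden state to form a query, and that regions are scored by dot product. The names `attend`, `regions`, `context` and `hidden` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(regions, context, hidden):
    """One attention step over image region features.

    Assumption (not from the paper): the query is hidden + context,
    and each region is scored by its dot product with the query.
    Returns the attention weights and the weighted sum of regions.
    """
    query = [h + c for h, c in zip(hidden, context)]
    weights = softmax([dot(region, query) for region in regions])
    dim = len(regions[0])
    attended = [
        sum(w * region[i] for w, region in zip(weights, regions))
        for i in range(dim)
    ]
    return weights, attended


# Toy example: three 2-D region features, one context, one hidden state.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = [0.5, 0.0]
hidden = [0.5, 0.0]
weights, attended = attend(regions, context, hidden)
```

In this toy case the query points along the first axis, so the first and third regions receive equal weight and the second region receives less. In a sequential model the step would be repeated at every timestep with an updated hidden state.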

Bibliographic details

Source
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2020, Issue 3 | pp. 636-650 | 15 pages
Author affiliations

Chinese Acad Sci CASIA CRIPAC Inst Automat NLPR Beijing 100864 Peoples R China;

Univ Adelaide ACRV Adelaide SA 5005 Australia;

Chinese Acad Sci CASIA Inst Automat NLPR CRIPAC CEBSIT Beijing 100864 Peoples R China|Univ Chinese Acad Sci Beijing 100049 Peoples R China;

Original format: PDF
Language: English
Keywords
Semantics; Image representation; Task analysis; Context modeling; Logic gates; Pattern matching; Image annotation; Semantic concept; semantic order; context-modulated attention; image and sentence matching;

Similar literature

1. Learning Semantic Concepts from Noisy Media Collection for Automatic Image Annotation [J]. Feng Tian, Xukun Shen. Chinese Journal of Electronics. 2015, Issue 4
2. Massive-scale learning of image and video semantic concepts [J]. Smith J.R., Cao L., Codella N.C.F., IBM Journal of Research and Development. 2015, Issue 2a3
3. Learning Semantic Concepts from Noisy Media Collection for Automatic Image Annotation [J]. TIAN Feng, SHEN Xukun. 电子学报（英文版）. 2015, Issue 004
4. Learning Semantic Concepts and Order for Image and Sentence Matching [C]. Yan Huang, Qi Wu, Chunfeng Song, IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018
5. Semantics-sensitive integrated matching for picture libraries and biomedical image databases [D]. Wang, James Ze. 2000
6. A Hybrid Normalization Method for Medical Concepts in Clinical Narrative using Semantic Matching [O]. Yen-Fu Luo, Weiyi Sun, Anna Rumshisky. 2019
7. Learning Semantic Concepts and Order for Image and Sentence Matching [O]. Yan Huang, Qi Wu, Chunfeng Song, 2018
