Journal: Multimedia Tools and Applications

Semantic image retrieval for complex queries using a knowledge parser


Abstract

To improve the accuracy of image retrieval systems, research focus has shifted from designing sophisticated low-level feature extraction algorithms to combining image retrieval with rich semantics and knowledge-based methods. In this paper, we aim to improve text-based image retrieval for complex natural language queries by using a semantic parser (Knowledge Parser, or K-Parser). From text written in natural language, the K-Parser extracts a graphical semantic representation of the objects involved, their properties, and their relations. We analyze both the textual image captions and the natural language queries with the K-Parser. As a technical solution, we leverage RDF in two ways: first, we store the parsed image captions as RDF triples; second, we translate image queries into SPARQL queries. When applying this approach to the Flickr8k dataset with a set of 16 custom queries, we notice that the K-Parser exhibits some biases that negatively affect the accuracy of the queries. We propose two techniques to address these weaknesses: (1) we introduce a set of rules to transform the output of the K-Parser and fix some basic, recurrent parsing mistakes that occur on the captions of Flickr8k; (2) we leverage two popular commonsense knowledge databases, ConceptNet and WordNet, to raise the accuracy of queries on broad concepts. Using these two techniques, we can fix most of the initial retrieval errors and accurately execute our set of 16 queries on the Flickr8k dataset.
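The pipeline the abstract describes (captions stored as triples, queries translated into graph patterns, broad concepts expanded through a knowledge base) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain Python tuples in place of an RDF store, a small pattern matcher in place of SPARQL, and a hand-written hyponym table in place of WordNet or ConceptNet. The predicate names (`instance_of`, `agent`), the image identifiers, and the function names are all hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical parser output for two captions ("A dog runs", "A cat sleeps").
# Predicate names are illustrative, not the real K-Parser vocabulary.
TRIPLES: List[Triple] = [
    ("img1:run_1", "instance_of", "run"),
    ("img1:run_1", "agent", "img1:dog_1"),
    ("img1:dog_1", "instance_of", "dog"),
    ("img2:sleep_1", "instance_of", "sleep"),
    ("img2:sleep_1", "agent", "img2:cat_1"),
    ("img2:cat_1", "instance_of", "cat"),
]

# Hand-written stand-in for a WordNet/ConceptNet lookup (assumption).
HYPONYMS: Dict[str, Set[str]] = {"animal": {"dog", "cat"}}


def expand(concept: str) -> Set[str]:
    """Return the concept plus any narrower concepts known for it."""
    return {concept} | HYPONYMS.get(concept, set())


def solve(
    patterns: List[Triple],
    triples: List[Triple],
    binding: Optional[Dict[str, str]] = None,
) -> Iterator[Dict[str, str]]:
    """Match a list of triple patterns; terms starting with '?' are variables."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        matched = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    matched = False
                    break
            elif term != value:
                matched = False
                break
        if matched:
            yield from solve(rest, triples, candidate)


def find_images(action: str, actor_concept: str) -> List[str]:
    """Find images whose caption has `action` performed by an `actor_concept`."""
    images: Set[str] = set()
    for concept in expand(actor_concept):
        patterns = [
            ("?event", "instance_of", action),
            ("?event", "agent", "?actor"),
            ("?actor", "instance_of", concept),
        ]
        for result in solve(patterns, TRIPLES):
            # Node identifiers are prefixed with the image id in this sketch.
            images.add(result["?event"].split(":")[0])
    return sorted(images)


print(find_images("run", "animal"))
```

Without the expansion step, a query for a running "animal" would match nothing here, because the caption only mentions a dog. This is the kind of broad-concept miss that the second technique in the abstract targets.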
