International Joint Conference on Natural Language Processing

Draw and Tell: Multimodal Descriptions Outperform Verbal- or Sketch-Only Descriptions in an Image Retrieval Task

Abstract

While language conveys meaning largely symbolically, actual communication acts typically contain iconic elements as well: People gesture while they speak, or may even draw sketches while explaining something. Image retrieval prima facie seems like a task that could profit from combined symbolic and iconic reference, but it is typically set up to work either from language only, or via (iconic) sketches with no verbal contribution. Using a model of grounded language semantics and a model of sketch-to-image mapping, we show that adding even very reduced iconic information to a verbal image description improves recall. Verbal descriptions paired with fully detailed sketches still perform better than these sketches alone. We see these results as supporting the assumption that natural user interfaces should respond to multimodal input, where possible, rather than just language alone.
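The abstract does not spell out how the verbal and iconic signals are combined. One common way to realize such a setup, sketched below purely as an illustration, is late fusion: score each candidate image separately against the verbal description and the sketch, then rank by a weighted sum. The encoders, the shared cosine-similarity spaces, and the weight alpha are all assumptions for this sketch, not the paper's actual models of grounded language semantics and sketch-to-image mapping.

    import numpy as np

    def cosine_sim(query, images):
        """Cosine similarity between one query vector and N image vectors (N x d)."""
        query = query / np.linalg.norm(query)
        images = images / np.linalg.norm(images, axis=1, keepdims=True)
        return images @ query

    def multimodal_rank(text_q, sketch_q, img_text_emb, img_sketch_emb, alpha=0.5):
        """Rank images by a weighted sum of verbal and sketch similarity.

        text_q, sketch_q  -- query embeddings from two hypothetical encoders
        img_text_emb      -- (N, d_t) image embeddings in the text-matching space
        img_sketch_emb    -- (N, d_s) image embeddings in the sketch-matching space
        alpha             -- interpolation weight between the two modalities
        """
        scores = (alpha * cosine_sim(text_q, img_text_emb)
                  + (1 - alpha) * cosine_sim(sketch_q, img_sketch_emb))
        return np.argsort(-scores)  # image indices, best match first

    def recall_at_k(ranking, target_idx, k=10):
        """Recall@k for one query: 1 if the target image is among the top k."""
        return int(target_idx in ranking[:k])

    # Toy usage with random data: 100 candidate images, 64-dim spaces.
    rng = np.random.default_rng(0)
    ranking = multimodal_rank(rng.normal(size=64), rng.normal(size=64),
                              rng.normal(size=(100, 64)),
                              rng.normal(size=(100, 64)))
    print(recall_at_k(ranking, target_idx=3, k=10))

Under this reading, the paper's finding that "even very reduced iconic information" helps corresponds to the sketch term contributing useful signal to the fused score even when the sketch embedding is coarse.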