Making History Matter: History-Advantage Sequence Training for Visual Dialog

Abstract

We study multi-round response generation in visual dialog, where each response is generated according to a visually grounded conversational history. Given a triplet of an image, a Q&A history, and the current question, all prevailing methods follow a codec (i.e., encoder-decoder) design in a supervised learning paradigm: a multimodal encoder encodes the triplet into a feature vector, which is then fed into the decoder to generate the current answer, supervised by the ground truth. However, this conventional supervised learning does not take into account the impact of imperfect history, which violates the conversational nature of visual dialog and makes the codec more inclined to learn history bias than contextual reasoning. To this end, inspired by the actor-critic policy gradient in reinforcement learning, we propose a novel training paradigm called History Advantage Sequence Training (HAST). Specifically, we intentionally impose wrong answers in the history, obtaining an adverse critic, and measure how the historic error impacts the codec's future behavior through the History Advantage, a quantity obtained by subtracting the adverse critic from the gold reward of the ground-truth history. Moreover, to make the codec more sensitive to the history, we propose a novel attention network called History-Aware Co-Attention Network (HACAN), which can be effectively trained using HAST. Experimental results on three benchmarks, VisDial v0.9, VisDial v1.0, and GuessWhat?!, show that the proposed HAST strategy consistently outperforms the state-of-the-art supervised counterparts.
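
The History Advantage described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `corrupt_history`, `history_advantage`, and `toy_reward` are hypothetical, the image input is omitted for brevity, and the word-overlap reward is a stand-in assumption for whatever score a trained codec would assign to the ground-truth answer. The sketch only shows the subtraction itself: the reward under the ground-truth history minus the reward under a history with one deliberately wrong answer.

```python
from typing import Callable, List, Tuple

# A dialog history is a list of (question, answer) rounds.
History = List[Tuple[str, str]]
RewardFn = Callable[[History, str, str], float]


def corrupt_history(history: History, round_idx: int, wrong_answer: str) -> History:
    """Return a copy of the history with the answer of one round replaced."""
    corrupted = list(history)
    question, _ = corrupted[round_idx]
    corrupted[round_idx] = (question, wrong_answer)
    return corrupted


def history_advantage(
    reward_fn: RewardFn,
    history: History,
    question: str,
    answer: str,
    round_idx: int,
    wrong_answer: str,
) -> float:
    """Gold reward (ground-truth history) minus adverse critic (corrupted history)."""
    gold_reward = reward_fn(history, question, answer)
    adverse_critic = reward_fn(
        corrupt_history(history, round_idx, wrong_answer), question, answer
    )
    return gold_reward - adverse_critic


def toy_reward(history: History, question: str, answer: str) -> float:
    """Hypothetical stand-in reward: fraction of answer words seen in history answers.

    The question is ignored here; a real codec would condition on it.
    """
    history_words = {word for _, past in history for word in past.split()}
    answer_words = answer.split()
    return sum(word in history_words for word in answer_words) / len(answer_words)


history = [("what animal is it?", "a dog"), ("what color is it?", "brown")]
advantage = history_advantage(
    toy_reward, history, "is the dog big?", "yes the dog is big", 0, "a cat"
)
```

In this toy setting a positive `advantage` means the reward for the current answer drops when the earlier answer is corrupted, so the scorer actually depends on the history. A value near zero would suggest the history is being ignored.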

Bibliographic details

Source: International Conference on Computer Vision | 2019 | 1 v. | 9 pages
Authors: Tianhao Yang; Zheng-Jun Zha; Hanwang Zhang
Original format: PDF
Chinese Library Classification: TP391.41
Keywords: History; Visualization; Training; Codecs; Gold; Task analysis; Decoding
