Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

Abstract

We introduce the first goal-driven training for visual question answering and dialog agents. Specifically, we pose a cooperative 'image guessing' game between two agents, Q-BOT and A-BOT, who communicate in natural language dialog so that Q-BOT can select an unseen image from a lineup of images. We use deep reinforcement learning (RL) to learn the policies of these agents end-to-end, from pixels to multi-agent multi-round dialog to game reward. We demonstrate two experimental results. First, as a 'sanity check' demonstration of pure RL (from scratch), we show results on a synthetic world, where the agents communicate in ungrounded vocabularies, i.e., symbols with no pre-specified meanings (X, Y, Z). We find that the two bots invent their own communication protocol and start using certain symbols to ask/answer about certain visual attributes (shape/color/style). Thus, we demonstrate the emergence of grounded language and communication among 'visual' dialog agents with no human supervision. Second, we conduct large-scale real-image experiments on the VisDial dataset [5], where we pretrain on dialog data with supervised learning (SL) and show that the RL fine-tuned agents significantly outperform supervised pretraining. Interestingly, the RL Q-BOT learns to ask questions that A-BOT is good at, ultimately resulting in more informative dialog and a better team.
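The 'sanity check' setting in the abstract, where two agents learn to use symbols with no pre-specified meaning, can be illustrated with a much smaller toy. The sketch below is not the authors' model: it assumes a single-round signaling game with no images, tabular softmax policies, and a plain REINFORCE update with a running-average baseline. All function names, the learning rate, and the episode count are hypothetical choices made for illustration.

```python
import math
import random


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_update(logits, probs, action, advantage, lr):
    """In-place REINFORCE step for a softmax policy.

    The gradient of log softmax(logits)[action] with respect to the
    logits is one_hot(action) - probs.
    """
    for i in range(len(logits)):
        grad = (1.0 if i == action else 0.0) - probs[i]
        logits[i] += lr * advantage * grad


def train(n_values=3, n_symbols=3, episodes=5000, lr=0.5, seed=0):
    """Train a toy A-BOT and Q-BOT on a shared game reward.

    Assumption: A-BOT sees a hidden attribute value and emits one symbol;
    Q-BOT sees only the symbol and guesses the value. Both receive
    reward 1 for a correct guess and 0 otherwise.
    """
    rng = random.Random(seed)
    a_logits = [[0.0] * n_symbols for _ in range(n_values)]  # value -> symbol
    q_logits = [[0.0] * n_values for _ in range(n_symbols)]  # symbol -> guess
    baseline = 0.0  # running average of the reward
    for _ in range(episodes):
        target = rng.randrange(n_values)
        a_probs = softmax(a_logits[target])
        symbol = sample(a_probs, rng)
        q_probs = softmax(q_logits[symbol])
        guess = sample(q_probs, rng)
        reward = 1.0 if guess == target else 0.0
        advantage = reward - baseline
        reinforce_update(a_logits[target], a_probs, symbol, advantage, lr)
        reinforce_update(q_logits[symbol], q_probs, guess, advantage, lr)
        baseline += 0.01 * (reward - baseline)
    return a_logits, q_logits


def greedy_accuracy(a_logits, q_logits):
    """Fraction of values recovered when both agents act greedily."""
    correct = 0
    for target, row in enumerate(a_logits):
        symbol = max(range(len(row)), key=row.__getitem__)
        guesses = q_logits[symbol]
        guess = max(range(len(guesses)), key=guesses.__getitem__)
        correct += guess == target
    return correct / len(a_logits)


if __name__ == "__main__":
    a_bot, q_bot = train()
    print("greedy accuracy:", greedy_accuracy(a_bot, q_bot))
```

Neither agent is told what a symbol means; any mapping from values to symbols that Q-BOT can invert earns the same reward, which is the sense in which a protocol can emerge from the game reward alone. A toy like this can also settle on a protocol that reuses one symbol for two values, so the accuracy it reports depends on the seed and hyperparameters.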

Bibliographic details

Source
IEEE International Conference on Computer Vision | 2017 | pp. 2970-3705 | 10 pages
Authors
Abhishek Das; Satwik Kottur; Jose M. F. Moura; Stefan Lee; Dhruv Batra
Original format: PDF
Chinese Library Classification: TP391.41-53

Similar literature

1. Coordinated behavior of cooperative agents using deep reinforcement learning [J]. Diallo Elhadji Amadou Oury, Sugiyama Ayumi, Sugawara Toshiharu. Neurocomputing. 2020, Jul. 5 issue
2. Multi-Agent Deep Reinforcement Learning-Based Cooperative Edge Caching for Ultra-Dense Next-Generation Networks [J]. Chen Shuangwu, Yao Zhen, Jiang Xiaofeng. IEEE Transactions on Communications. 2021, No. 4
3. Cooperative Management for PV/ESS-Enabled Electric Vehicle Charging Stations: A Multiagent Deep Reinforcement Learning Approach [J]. Shin MyungJae, Choi Dae-Hyun, Kim Joongheon. IEEE Transactions on Industrial Informatics. 2020, No. 5
4. Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning [C]. Abhishek Das, Satwik Kottur, Jose M. F. Moura. IEEE International Conference on Computer Vision. 2017
5. Macro-Action-Based Multi-Agent Deep Reinforcement Learning in Cooperative Tasks [D]. Lu, Xingyu. 2021
6. Perspective Taking in Deep Reinforcement Learning Agents [O]. Aqeel Labash, Jaan Aru, Tambet Matiisen. 2020
7. Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning [O]. Das, Abhishek; Kottur, Satwik; Moura, José M. F. 2017
