Learning What Information to Give in Partially Observed Domains

Rohan Chitnis; Leslie Pack Kaelbling; Tomas Lozano-Perez

Abstract

In many robotic applications, an autonomous agent must act within and explore a partially observed environment that is unobserved by its human team-mate. We consider such a setting in which the agent can, while acting, transmit declarative information to the human that helps them understand aspects of this unseen environment. In this work, we address the algorithmic question of how the agent should plan out what actions to take and what information to transmit. Naturally, one would expect the human to have preferences, which we model information-theoretically by scoring transmitted information based on the change it induces in weighted entropy of the human’s belief state. We formulate this setting as a belief MDP and give a tractable algorithm for solving it approximately. Then, we give an algorithm that allows the agent to learn the human’s preferences online, through exploration. We validate our approach experimentally in simulated discrete and continuous partially observed search-and-recover domains. Visit http://tinyurl.com/chitnis-corl-18 for a supplementary video.
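The scoring idea in the abstract, rating transmitted information by the change it induces in the weighted entropy of the human's belief state, can be illustrated with a small sketch. The abstract does not give the exact formula, so everything below is an assumption made for illustration: the belief is a discrete distribution over states, weighted entropy takes the common form -Σ w_i p_i log p_i, a declarative fact is modeled as the set of states it leaves possible, and the function names (`weighted_entropy`, `condition`, `information_score`) are hypothetical rather than taken from the paper.

```python
import math


def weighted_entropy(belief, weights):
    """Assumed form: -sum_i w_i * p_i * log(p_i), skipping zero-probability states."""
    return -sum(weights[s] * p * math.log(p) for s, p in belief.items() if p > 0)


def condition(belief, consistent_states):
    """Update the belief on a fact that rules out every state not in consistent_states."""
    mass = sum(p for s, p in belief.items() if s in consistent_states)
    if mass == 0:
        raise ValueError("information contradicts the current belief")
    return {s: (p / mass if s in consistent_states else 0.0) for s, p in belief.items()}


def information_score(belief, weights, consistent_states):
    """Score a piece of information by the drop in weighted entropy it causes."""
    before = weighted_entropy(belief, weights)
    after = weighted_entropy(condition(belief, consistent_states), weights)
    return before - after


# Hypothetical example: the human is unsure which of four rooms holds an object.
belief = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
uniform_weights = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}

# "The object is in A or B" halves the hypothesis space.
score_partial = information_score(belief, uniform_weights, {"A", "B"})

# "The object is in A" removes all uncertainty.
score_full = information_score(belief, uniform_weights, {"A"})
```

With uniform weights this reduces to ordinary Shannon entropy reduction. Non-uniform weights let some states matter more to the human than others, which is one way the preferences mentioned in the abstract could enter the score.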

Bibliographic details

Source: JMLR: Workshop and Conference Proceedings, 2018, No. 2010, 10 pages
Authors: Rohan Chitnis; Leslie Pack Kaelbling; Tomas Lozano-Perez
Original format: PDF
