In this paper, we propose a Profit Sharing method that can learn a deterministic policy in POMDP environments. The proposed method obtains the deterministic policy by using the history of observations. First, it detects the states subject to perceptual aliasing, where different states are perceived as the same observation. In such aliased states, the action is selected based on the history of observations. To use observation histories in action selection, rules over observation sequences and their values are defined. The proposed method can thus finally learn a deterministic policy by considering the history of observations where needed. We carried out a series of computer experiments and confirmed that the proposed method can detect perceptually aliased states in POMDP environments and can obtain a deterministic policy using the values of the observation-sequence rules.
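The idea above can be sketched as follows. This is a minimal, hypothetical illustration (not the authors' implementation): rule weights are keyed on observation histories, a plain single-observation key is used for ordinary states, a longer history key for states flagged as aliased, and Profit Sharing distributes the goal reward back along the episode with geometrically decreasing credit. The class name, the fixed history length of two, and the `flag_aliased` hook are assumptions for illustration; in the paper, aliasing is detected from experience.

```python
import random
from collections import defaultdict

class HistoryProfitSharing:
    """Sketch of Profit Sharing with observation-history rules (illustrative)."""

    def __init__(self, actions, discount=0.5):
        self.actions = actions
        self.discount = discount            # geometric credit-decay factor
        self.weights = defaultdict(float)   # (history key, action) -> rule value
        self.aliased = set()                # observations flagged as aliased

    def _key(self, history):
        # Ordinary states use the last observation only; aliased states
        # use a longer observation sequence to disambiguate them.
        obs = history[-1]
        return tuple(history[-2:]) if obs in self.aliased else (obs,)

    def flag_aliased(self, obs):
        # Stand-in for the paper's aliasing detection, which is learned
        # from experience; here the caller flags aliased observations.
        self.aliased.add(obs)

    def select_action(self, history):
        key = self._key(history)
        scored = [(self.weights[(key, a)], a) for a in self.actions]
        best = max(scored)[0]
        return random.choice([a for v, a in scored if v == best])

    def reinforce(self, episode, reward):
        # Profit Sharing: on reaching the goal, share the reward over the
        # episode's rules with geometrically decreasing credit.
        credit = reward
        for history, action in reversed(episode):
            self.weights[(self._key(history), action)] += credit
            credit *= self.discount
```

After a rewarded episode, rules keyed on longer observation sequences accumulate value only at aliased observations, so the greedy policy becomes deterministic while history is consulted only where needed.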