Journal: IEEE Transactions on Autonomous Mental Development

Interactive Learning in Continuous Multimodal Space: A Bayesian Approach to Action-Based Soft Partitioning and Learning

Abstract

A probabilistic framework for interactive learning in continuous and multimodal perceptual spaces is proposed. In this framework, the agent learns the task while adaptively partitioning its multimodal perceptual space. The learning process is formulated in a Bayesian reinforcement learning setting to facilitate the adaptive partitioning. The partitioning is done gradually and softly using Gaussian distributions, whose parameters are adapted based on the agent's estimate of its actions' expected values. The probabilistic nature of the method yields experience generalization as well as robustness against uncertainty and noise. To benefit from the diversity of experience generalization across perceptual subspaces, learning is performed in parallel in multiple perceptual subspaces, including the original space. In every learning step, the policies learned in the subspaces are fused to select the final action. This concurrent learning in multiple spaces, together with the decision fusion, results in faster learning, the possibility of adding and/or removing sensors (i.e., gradual expansion or contraction of the perceptual space), and appropriate robustness against probable failure of, or ambiguity in, the sensor data. Results of two sets of simulations and some experiments are reported to demonstrate the key properties of the framework.
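
As a rough illustration of the two ideas named in the abstract, soft Gaussian partitioning and fusion of decisions across perceptual subspaces, the following minimal sketch may help. It is not the paper's algorithm: the diagonal covariances, the membership-weighted averaging of action values, the plain averaging used for fusion, and all function and variable names (`soft_memberships`, `subspace_action_values`, `fuse_and_select`) are assumptions made for illustration, and the Bayesian adaptation of the Gaussian parameters is omitted entirely.

```python
import numpy as np

def soft_memberships(x, means, variances):
    """Normalized Gaussian membership of observation x in each partition.

    Assumes diagonal covariances: means and variances have shape (K, D).
    """
    x = np.asarray(x, dtype=float)
    diff = x - means
    log_p = -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=1)
    log_p -= log_p.max()  # subtract the max for numerical stability
    w = np.exp(log_p)
    return w / w.sum()

def subspace_action_values(x, means, variances, q_table):
    """Membership-weighted average of per-partition action values (shape (A,))."""
    return soft_memberships(x, means, variances) @ q_table

def fuse_and_select(per_subspace_values):
    """Hypothetical fusion rule: average the values over subspaces, then take argmax."""
    fused = np.mean(per_subspace_values, axis=0)
    return int(np.argmax(fused)), fused

# Toy setup: two partitions and two actions in each subspace (made-up numbers).
q = np.array([[1.0, 0.0],   # partition 0 favors action 0
              [0.0, 1.0]])  # partition 1 favors action 1
obs = np.array([0.9, 1.1])

# Original 2-D perceptual space.
full = subspace_action_values(obs, np.array([[0.0, 0.0], [1.0, 1.0]]),
                              np.full((2, 2), 0.25), q)
# 1-D subspace that uses only the first sensor.
sub = subspace_action_values(obs[:1], np.array([[0.0], [1.0]]),
                             np.full((2, 1), 0.25), q)

action, fused = fuse_and_select([full, sub])
```

Because each subspace contributes its own value estimate, dropping a sensor in this sketch amounts to leaving its subspace out of the list passed to `fuse_and_select`, which loosely mirrors the abstract's point about adding or removing sensors.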
