IEEE International Conference on Robotics and Automation

Speeding Up Incremental Learning Using Data Efficient Guided Exploration


Abstract

To cope with varying conditions, motor primitives (MPs) must support generalization over task parameters so that a separate primitive need not be learned for each situation. Both deterministic and probabilistic models have been proposed for generalizing MPs to new task parameters, but these provide only limited generalization. Although the generalization of MPs using probabilistic models has been studied, it remains unclear how such generalizable models can be learned efficiently. Reinforcement learning can be made more efficient by tuning the exploration process with data uncertainty, reducing unnecessary exploration in a data-efficient way. We propose an empirical Bayes method to predict this uncertainty and use it to guide the exploration process of an incremental learning framework. The online incremental learning framework uses a single human demonstration to construct a database of MPs. The main ingredients of the proposed framework are a global parametric model (GPDMP) for generalizing MPs to new situations, a model-free policy search agent for optimizing failed predicted MPs, model selection for controlling the complexity of GPDMP, and empirical Bayes for extracting the uncertainty of MP predictions. Experiments with a ball-in-a-cup task demonstrate that the global GPDMP model generalizes significantly better than linear models and Locally Weighted Regression, especially in terms of extrapolation capability. Furthermore, model selection successfully identified the required complexity of GPDMP even with few training samples, while satisfying the Occam's Razor principle. Above all, the uncertainty predicted by the proposed empirical Bayes approach successfully guided the exploration process of the model-free policy search. The experiments indicated a statistically significant improvement in learning speed over covariance matrix adaptation (CMA), with a significance of p = 0.002.
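The core idea of uncertainty-guided exploration — scaling the exploration noise of a model-free policy search by the predicted uncertainty of the MP model — can be sketched as follows. This is an illustrative toy in Python, not the paper's implementation: `predict_uncertainty`, `rollout_reward`, and the 1-D policy parameterization are hypothetical stand-ins for the empirical-Bayes predictive variance and the ball-in-a-cup rollout.

```python
import random

def predict_uncertainty(task_param, train_params):
    # Toy stand-in for the empirical-Bayes predictive uncertainty:
    # uncertainty grows with distance from the nearest training task.
    return min(abs(task_param - p) for p in train_params)

def rollout_reward(policy_param, task_param):
    # Toy reward: negative squared error to a (hidden) optimum at 2 * task_param.
    return -(policy_param - 2.0 * task_param) ** 2

def guided_policy_search(task_param, train_params, init_param, iters=200, seed=0):
    """Hill-climbing policy search whose exploration noise is
    initialized from the predicted model uncertainty."""
    rng = random.Random(seed)
    # Far from the training data -> high uncertainty -> wide exploration;
    # near it -> the predicted MP is trusted and exploration stays small.
    sigma = 0.1 + predict_uncertainty(task_param, train_params)
    best_param = init_param
    best_r = rollout_reward(init_param, task_param)
    for _ in range(iters):
        cand = best_param + rng.gauss(0.0, sigma)
        r = rollout_reward(cand, task_param)
        if r > best_r:
            best_param, best_r = cand, r
            sigma *= 0.95  # shrink exploration as the policy improves
    return best_param, best_r
```

In this sketch the uncertainty only sets the initial exploration width; the paper's framework additionally uses it to decide when the predicted MP has failed and the policy search must be invoked at all.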
