Journal: ACM Transactions on Interactive Intelligent Systems
Nonstrict Hierarchical Reinforcement Learning for Interactive Systems and Robots

Abstract

Conversational systems and robots that use reinforcement learning for policy optimization in large domains often face the problem of limited scalability. This problem has been addressed either by using function approximation techniques that approximate the true value function of a policy or by using a hierarchical decomposition of a learning task into subtasks. We present a novel approach for dialogue policy optimization that combines the benefits of both hierarchical control and function approximation and that allows flexible transitions between dialogue subtasks to give human users more control over the dialogue. To this end, each reinforcement learning agent in the hierarchy is extended with a subtask transition function and a dynamic state space to allow flexible switching between subdialogues. In addition, the subtask policies are represented with linear function approximation in order to generalize the decision making to situations unseen in training. Our proposed approach is evaluated in an interactive conversational robot that learns to play quiz games. Experimental results, using simulation and real users, provide evidence that our proposed approach can lead to more flexible (natural) interactions than strict hierarchical control and that it is preferred by human users.
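
The two ingredients named in the abstract, a subtask policy with linear function approximation and a subtask transition function that permits nonstrict switching, can be illustrated with a minimal sketch. This is not the authors' implementation: the class `LinearQ`, the function `next_subtasks`, the subtask names in `CHILDREN`, and the Q-learning-style temporal-difference update are all assumptions chosen for illustration; the abstract states only that the policies use linear function approximation.

```python
from typing import Dict, List, Optional, Sequence


class LinearQ:
    """Hypothetical subtask policy: Q(s, a) = theta_a . phi(s)."""

    def __init__(self, actions: Sequence[str], n_features: int,
                 alpha: float = 0.1, gamma: float = 0.9) -> None:
        self.actions = list(actions)
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # One weight vector per action, initialised to zero.
        self.theta: Dict[str, List[float]] = {
            a: [0.0] * n_features for a in self.actions
        }

    def q(self, phi: Sequence[float], action: str) -> float:
        """Linear estimate of the action value for feature vector phi."""
        return sum(w * x for w, x in zip(self.theta[action], phi))

    def greedy(self, phi: Sequence[float]) -> str:
        """Action with the highest estimated value."""
        return max(self.actions, key=lambda a: self.q(phi, a))

    def update(self, phi: Sequence[float], action: str, reward: float,
               phi_next: Sequence[float], done: bool) -> float:
        """One Q-learning-style TD update (an assumed learning rule)."""
        target = reward
        if not done:
            target += self.gamma * max(self.q(phi_next, a) for a in self.actions)
        td_error = target - self.q(phi, action)
        self.theta[action] = [
            w + self.alpha * td_error * x
            for w, x in zip(self.theta[action], phi)
        ]
        return td_error


# Hypothetical subtask hierarchy for a quiz game (names are illustrative).
CHILDREN: Dict[str, List[str]] = {
    "root": ["greeting", "play_game"],
    "play_game": ["ask_question", "answer_question"],
}


def next_subtasks(current: str, user_requested: Optional[str] = None,
                  strict: bool = True) -> List[str]:
    """Subtasks reachable from `current`.

    Strict control allows only the children of the current subtask;
    the nonstrict variant also admits a subtask the user asked for.
    """
    options = list(CHILDREN.get(current, []))
    if not strict and user_requested is not None and user_requested not in options:
        options.append(user_requested)
    return options
```

In this sketch, the only difference between strict and nonstrict control is whether a user-requested subtask outside the current subtask's children is admitted as a transition target. Because the value estimate is a weighted sum of features, the policy can also score states it never encountered during training.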