Venue: IEEE Workshop on Spoken Language Technology

Simultaneous feature selection and parameter optimization for training of dialog policy by reinforcement learning

Abstract

This paper addresses the problem of feature selection in the reinforcement learning (RL) of dialog policies for spoken dialog systems. A statistical dialog manager selects the actions the system should take based on features derived from the current dialog state and/or the system's belief state. When defining the features used to train the dialog policy, however, it is not obvious how to find a set of actually effective features among the potentially useful ones. In addition, the selection should be done simultaneously with the optimization of the dialog policy. In this paper, we propose an incremental feature selection method for the optimization of a dialog policy by RL, in which improvement of the dialog policy and feature selection are conducted simultaneously. Experiments in dialog policy optimization by RL with a user simulator demonstrated that 1) the proposed method can find a better dialog policy with fewer policy iterations and 2) the learning speed is comparable with the case where feature selection is conducted in advance.
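The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of interleaving feature selection with policy improvement. It is not the authors' method. Everything in it is an assumption made for illustration: the one-step toy task, the softmax-linear policy, the greedy rule that activates the inactive candidate feature with the largest policy-gradient magnitude, and all names and parameters (`train`, `select_every`, `threshold`, and so on).

```python
import math
import random


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reward(x, action):
    # Toy reward (assumption): action 1 is correct when feature 0 is positive,
    # action 0 otherwise. All other candidate features are irrelevant noise.
    return 1.0 if action == (1 if x[0] > 0 else 0) else 0.0


def policy(weights, active, x):
    # Only the currently selected (active) features contribute to action scores.
    return softmax([sum(w[j] * x[j] for j in active) for w in weights])


def expected_reward(weights, active, states):
    total = 0.0
    for x in states:
        probs = policy(weights, active, x)
        total += sum(p * reward(x, a) for a, p in enumerate(probs))
    return total / len(states)


def policy_gradient(weights, active, states, n_actions, n_features):
    # Exact gradient of the expected reward with respect to the weight of every
    # candidate feature, including inactive ones (whose weights are still zero).
    grad = [[0.0] * n_features for _ in range(n_actions)]
    for x in states:
        probs = policy(weights, active, x)
        rewards = [reward(x, a) for a in range(n_actions)]
        value = sum(p * r for p, r in zip(probs, rewards))
        for a in range(n_actions):
            coeff = probs[a] * (rewards[a] - value)
            for j in range(n_features):
                grad[a][j] += coeff * x[j] / len(states)
    return grad


def train(states, n_actions=2, n_iters=120, select_every=30, lr=1.0, threshold=0.05):
    n_features = len(states[0])
    weights = [[0.0] * n_features for _ in range(n_actions)]
    active = []  # indices of selected features, in order of selection
    for it in range(n_iters):
        grad = policy_gradient(weights, active, states, n_actions, n_features)

        # Selection step: periodically activate the most promising inactive feature.
        if it % select_every == 0:
            inactive = [j for j in range(n_features) if j not in active]
            if inactive:
                def score(j):
                    return math.sqrt(sum(grad[a][j] ** 2 for a in range(n_actions)))
                best = max(inactive, key=score)
                if score(best) > threshold:
                    active.append(best)

        # Policy improvement step: gradient ascent on active features only.
        for a in range(n_actions):
            for j in active:
                weights[a][j] += lr * grad[a][j]
    return weights, active


rng = random.Random(0)
states = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
weights, active = train(states)
final_reward = expected_reward(weights, active, states)
print("selected features:", active, "expected reward:", round(final_reward, 3))
```

In this toy setting the informative feature has by far the largest initial gradient, so it is activated first and the policy improves from the first iteration, without a separate feature-selection pass beforehand. A real dialog system would estimate returns from simulated dialogs rather than compute them exactly, as the experiments with a user simulator in the abstract suggest.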
