Journal: IEEE Journal of Selected Topics in Signal Processing

A Comprehensive Reinforcement Learning Framework for Dialogue Management Optimization

Abstract

Reinforcement learning is now an established approach for optimizing the interaction strategy of spoken dialogue systems. While the first algorithms considered were quite basic (such as SARSA), recent work has concentrated on more sophisticated methods. More attention has been paid to off-policy learning, the exploration-exploitation dilemma, sample efficiency, and the handling of non-stationarity. New algorithms have been proposed to address these issues and have been applied to dialogue management. However, each algorithm typically solves a single issue at a time, whereas dialogue systems exhibit all of these problems at once. In this paper, we propose applying the Kalman Temporal Differences (KTD) framework to the problem of dialogue strategy optimization, so as to address all of these issues comprehensively within a single framework. Our claims are illustrated by experiments conducted on two real-world goal-oriented dialogue management frameworks, DIPPER and HIS.
