Conference: International Conference on Advanced Communication Technology

A Q-learning algorithm applied to the behavioural decision-making of affective virtual human

Abstract

The traditional Q-learning algorithm suffers from lag in data transmission, and its environmental reward model is too simple, so it is poorly suited to reinforcement learning of behavioural decisions for an affective virtual human. Drawing an analogy with human self-reflection, this paper proposes an improved Q-learning algorithm that can be readily applied to the behavioural decision-making of an affective virtual human. By means of a self-reflection reward, the algorithm strengthens the behaviour strategies of better learning cycles and weakens those of worse learning cycles. It also speeds up the effect of behavioural-decision feedback on state-action pairs within a learning cycle, which improves the convergence rate of Q-learning in the behavioural decision-making of an affective virtual human. In the simulation test, the algorithm helps an affective virtual human carry out path optimization in a two-dimensional grid environment. The results show that the improved Q-learning algorithm is significantly faster than the traditional Q-learning algorithm in achieving the optimal control strategy, with an average of 43.7 learning cycles. This verifies the validity of the algorithm.
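For orientation, the traditional tabular Q-learning baseline that the abstract compares against can be sketched as below. This is a minimal illustration, not the paper's method: the abstract does not specify the self-reflection reward or the faster feedback rule, so neither is reproduced here. The 5x5 grid, the start and goal cells, the reward values, the hyperparameters, and the names `train_q_learning`, `greedy_path`, and `step` are all assumptions made for the example.

```python
import random

# Four moves on the grid: right, left, down, up (row delta, column delta).
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def step(state, action, size):
    """Apply a move; bumping into a wall leaves the agent where it is."""
    dr, dc = ACTIONS[action]
    row = min(max(state[0] + dr, 0), size - 1)
    col = min(max(state[1] + dc, 0), size - 1)
    return (row, col)


def train_q_learning(size=5, episodes=3000, alpha=0.5, gamma=0.9,
                     epsilon=0.3, seed=0):
    """Standard tabular Q-learning with an epsilon-greedy behaviour policy.

    Assumed setup: start at (0, 0), goal at the opposite corner,
    reward +1 on reaching the goal and -0.01 for every other step.
    """
    rng = random.Random(seed)
    goal = (size - 1, size - 1)
    n_actions = len(ACTIONS)
    q = {((r, c), a): 0.0
         for r in range(size) for c in range(size) for a in range(n_actions)}
    for _ in range(episodes):  # one episode = one "learning cycle"
        state = (0, 0)
        for _ in range(size * size * 4):  # step cap per episode
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda i: q[(state, i)])
            nxt = step(state, action, size)
            reward = 1.0 if nxt == goal else -0.01
            # The goal is terminal, so it contributes no future value.
            best_next = 0.0 if nxt == goal else max(
                q[(nxt, i)] for i in range(n_actions))
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
            if state == goal:
                break
    return q


def greedy_path(q, size=5, max_steps=100):
    """Follow the highest-valued action from the start cell."""
    state, goal = (0, 0), (size - 1, size - 1)
    path = [state]
    while state != goal and len(path) <= max_steps:
        action = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        state = step(state, action, size)
        path.append(state)
    return path


if __name__ == "__main__":
    table = train_q_learning()
    print(greedy_path(table))
```

In this baseline, reward information reaches earlier state-action pairs only one step per update, which is one plausible reading of the feedback lag the abstract says the improved algorithm addresses.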