Delayed processing for arm policy determination for content management system messaging

Abstract

Computer-implemented techniques include receiving, during a delayed processing window, reward data for arm actions taken. Those arm actions were chosen based on a previous version of an arm choice policy, and that previous version was itself determined based on a previous set of reward data for a previous set of arm actions taken. When the delayed processing window has closed, a new arm choice policy is determined based at least in part on the newly received action-reward data, together with the previous set of reward data and/or the previous arm choice policy. After a request to choose an arm is received, a particular arm action to take is determined based on the new arm choice policy, and the chosen arm is provided in response to the request.
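
The windowed update described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the abstract does not name a bandit algorithm, so Beta-Bernoulli Thompson sampling with binary rewards is assumed here, and the class `DelayedBandit` and its methods `choose_arm`, `record_reward`, and `close_window` are hypothetical names introduced for the example.

```python
import random


class DelayedBandit:
    """Hypothetical sketch: Thompson sampling with batched policy updates.

    Assumption: rewards are binary (0 or 1) and each arm's policy is a
    Beta(alpha, beta) posterior. The abstract does not specify either.
    """

    def __init__(self, n_arms, seed=0):
        self.n_arms = n_arms
        # Current (frozen) policy: one [alpha, beta] pair per arm.
        self.policy = [[1.0, 1.0] for _ in range(n_arms)]
        # Reward data received during the open window, not yet applied.
        self.pending = []
        self.version = 0
        self.rng = random.Random(seed)

    def choose_arm(self):
        """Answer a request using the current policy version only."""
        samples = [self.rng.betavariate(a, b) for a, b in self.policy]
        return max(range(self.n_arms), key=samples.__getitem__)

    def record_reward(self, arm, reward):
        """Buffer reward data during the delayed processing window."""
        self.pending.append((arm, reward))

    def close_window(self):
        """Build the new policy from the previous policy plus buffered data."""
        new_policy = [params[:] for params in self.policy]
        for arm, reward in self.pending:
            if reward:
                new_policy[arm][0] += 1.0  # one more observed success
            else:
                new_policy[arm][1] += 1.0  # one more observed failure
        self.policy = new_policy
        self.pending = []
        self.version += 1
```

In this sketch, calls to `choose_arm` made while a window is open keep using the previous policy version; buffered rewards influence choices only after `close_window` has run, which mirrors the ordering the abstract describes.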
