
Online temporal difference learning from incomplete customer interaction histories



Abstract

In one embodiment, an indication that a decision has been requested, selected, or applied with respect to one or more users may be obtained. After this indication is obtained, a value function may be updated, where the value function approximates the expected reward associated with the one or more users over the time since the decision was requested, selected, or applied. The value function may be updated by performing or providing one or more updates to it, where the time at which each update is performed or provided is independent of the activity of the one or more users.
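The idea in the abstract, value-function updates whose timing does not depend on user activity, can be illustrated with a small sketch. This is not the patent's claimed method: it substitutes a plain tabular TD(0) rule, and the class name `ScheduledTDLearner`, its methods, the state encoding `(decision, step)`, and the parameters `alpha`, `gamma`, `interval`, and `horizon` are all hypothetical choices made for illustration.

```python
class ScheduledTDLearner:
    """Tabular TD(0) sketch (hypothetical): updates fire on a fixed clock,
    not when the user happens to act."""

    def __init__(self, alpha=0.1, gamma=0.9, interval=1.0, horizon=5):
        self.alpha = alpha        # learning rate (assumed value)
        self.gamma = gamma        # discount per interval (assumed value)
        self.interval = interval  # time between scheduled updates (assumed)
        self.horizon = horizon    # number of intervals tracked per decision
        self.values = {}          # (decision, step) -> estimated future reward
        self.pending = {}         # user_id -> [decision, step, reward, next_due]

    def on_decision(self, user_id, decision, now):
        """Record that a decision was requested, selected, or applied."""
        self.pending[user_id] = [decision, 0, 0.0, now + self.interval]

    def on_reward(self, user_id, reward):
        """Accumulate reward from user activity; this does NOT trigger an update."""
        rec = self.pending.get(user_id)
        if rec is not None:
            rec[2] += reward

    def tick(self, now):
        """Run every update due at or before `now`. Meant to be driven by a timer.
        A silent user simply contributes a reward of 0 for the interval."""
        updates = 0
        for user_id in list(self.pending):
            rec = self.pending[user_id]
            while rec[3] <= now:
                decision, step, reward, due = rec
                terminal = step + 1 >= self.horizon
                next_value = 0.0 if terminal else self.values.get((decision, step + 1), 0.0)
                old = self.values.get((decision, step), 0.0)
                # TD(0): move the estimate toward reward + gamma * V(next state)
                td_error = reward + self.gamma * next_value - old
                self.values[(decision, step)] = old + self.alpha * td_error
                updates += 1
                if terminal:
                    del self.pending[user_id]
                    break
                rec[1], rec[2], rec[3] = step + 1, 0.0, due + self.interval
        return updates
```

In this sketch a scheduler would call `tick` periodically, so a user who never returns still produces updates (with zero reward), which is one way to learn from incomplete interaction histories.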


Bibliographic data

Publication number: US9367820B2

Publication date: 2016-06-14

Original format: PDF
Applicant/assignee: NICE SYSTEMS TECHNOLOGIES UK LIMITED

Application number: US201414571403
Inventors: LEONARD MICHAEL NEWNHAM; JASON DEREK MCFALL; DAVID J BARKER; DAVID SILVER

Filing date: 2014-12-16
Classification: G06N99; G06N5/04
Country: US
Added to database: 2022-08-21 14:31:36
