Journal: Future Generation Computer Systems

A gradient-based reinforcement learning approach to dynamic pricing in partially-observable environments



Abstract

As more companies are beginning to adopt the e-business model, it becomes easier for buyers to compare prices at multiple sellers and choose the one that charges the best price for the same item or service. As a result, the demand for the goods of a particular seller is becoming more unstable, since other sellers are regularly offering discounts that attract large fractions of buyers. Therefore, it becomes more important for each seller to switch from static to dynamic pricing policies that take into account observable characteristics of the current demand and the state of the seller's resources. This paper presents a Reinforcement Learning algorithm that can tune parameters of a seller's dynamic pricing policy in a gradient direction (thus converging to the optimal parameter values that maximize the revenue obtained by the seller) even when the seller's environment is not fully observable. This algorithm is evaluated using a simulated Grid market environment, where customers choose a Grid Service Provider (GSP) to which they want to submit a computing job based on the posted price and expected delay information at each GSP.
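
To make the idea of tuning pricing-policy parameters "in a gradient direction" under partial observability concrete, here is a minimal sketch. It is not the paper's algorithm: it uses a generic REINFORCE-style policy-gradient update with a softmax policy over a few discrete prices. The price levels, the toy demand model, the noisy binary observation, and every name and constant below are assumptions made only for illustration.

```python
import math
import random

# All constants below are hypothetical and chosen only for this toy example.
PRICES = [1.0, 2.0, 3.0, 4.0]   # discrete price levels the seller may post
LOW, HIGH = 0.6, 1.2            # hidden demand levels, never shown to the seller
OBS_ACCURACY = 0.8              # chance the noisy observation matches the hidden level


def softmax(prefs):
    """Turn a list of preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def buy_probability(price, demand):
    """Toy demand curve: purchase probability falls linearly with price."""
    return max(0.0, min(1.0, demand - 0.2 * price))


def train(episodes=30000, alpha=0.02, seed=0):
    """REINFORCE with a running-average baseline, one softmax per observation."""
    rng = random.Random(seed)
    theta = [[0.0] * len(PRICES) for _ in range(2)]  # policy parameters
    baseline = [0.0, 0.0]                            # per-observation reward average
    for _ in range(episodes):
        demand = rng.choice([LOW, HIGH])             # hidden state
        truth = 1 if demand == HIGH else 0
        # The seller only sees a noisy signal of the hidden demand level.
        obs = truth if rng.random() < OBS_ACCURACY else 1 - truth
        probs = softmax(theta[obs])
        action = rng.choices(range(len(PRICES)), weights=probs)[0]
        price = PRICES[action]
        sold = rng.random() < buy_probability(price, demand)
        reward = price if sold else 0.0
        advantage = reward - baseline[obs]
        # Gradient of log softmax: indicator(i == action) - probs[i].
        for i in range(len(PRICES)):
            grad_log = (1.0 if i == action else 0.0) - probs[i]
            theta[obs][i] += alpha * advantage * grad_log
        baseline[obs] += 0.01 * (reward - baseline[obs])
    return theta


def expected_revenue(theta):
    """Exact expected per-customer revenue of the policy in the toy market."""
    total = 0.0
    for demand in (LOW, HIGH):
        truth = 1 if demand == HIGH else 0
        for obs in (0, 1):
            p_obs = OBS_ACCURACY if obs == truth else 1.0 - OBS_ACCURACY
            probs = softmax(theta[obs])
            for price, p in zip(PRICES, probs):
                total += 0.5 * p_obs * p * price * buy_probability(price, demand)
    return total


if __name__ == "__main__":
    learned = train()
    uniform = [[0.0] * len(PRICES) for _ in range(2)]
    print("uniform policy revenue:", round(expected_revenue(uniform), 3))
    print("learned policy revenue:", round(expected_revenue(learned), 3))
```

In this sketch the seller never sees `demand` directly; the policy conditions only on the noisy observation, which is one simple way to mimic an environment that is not fully observable. A realistic Grid market simulation with competing GSPs and delay-sensitive customers would need a much richer environment model than this one.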
