A gradient-based reinforcement learning approach to dynamic pricing in partially-observable environments

David Vengerov


Abstract

As more companies are beginning to adopt the e-business model, it becomes easier for buyers to compare prices at multiple sellers and choose the one that charges the best price for the same item or service. As a result, the demand for the goods of a particular seller is becoming more unstable, since other sellers are regularly offering discounts that attract large fractions of buyers. Therefore, it becomes more important for each seller to switch from static to dynamic pricing policies that take into account observable characteristics of the current demand and the state of the seller's resources. This paper presents a Reinforcement Learning algorithm that can tune parameters of a seller's dynamic pricing policy in a gradient direction (thus converging to the optimal parameter values that maximize the revenue obtained by the seller) even when the seller's environment is not fully observable. This algorithm is evaluated using a simulated Grid market environment, where customers choose a Grid Service Provider (GSP) to which they want to submit a computing job based on the posted price and expected delay information at each GSP.
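The idea of tuning a pricing policy's parameters in a gradient direction can be illustrated with a minimal sketch. This is not the paper's algorithm or its Grid market simulation. It is a generic score-function (REINFORCE-style) update under assumed conditions: a hypothetical linear demand curve `buy_probability`, a Gaussian pricing policy with a single parameter `mu`, and illustrative values for the learning rate and exploration noise. The seller never sees the demand curve and learns only from the revenue of each posted price.

```python
import random


def buy_probability(price, max_price=10.0):
    """Hypothetical demand curve: chance that a customer accepts `price`."""
    return min(1.0, max(0.0, 1.0 - price / max_price))


def train_price_policy(steps=40000, lr=0.001, sigma=0.5, seed=0):
    """Tune the mean `mu` of a Gaussian pricing policy by stochastic gradient ascent."""
    rng = random.Random(seed)
    mu = 1.0        # initial mean price (the only policy parameter in this toy)
    baseline = 0.0  # running average of revenue, used to reduce variance
    for _ in range(steps):
        price = rng.gauss(mu, sigma)   # sample a price from the policy
        posted = max(0.0, price)       # the market never sees a negative price
        sold = rng.random() < buy_probability(posted)
        revenue = posted if sold else 0.0
        # Score function of a Gaussian with respect to its mean:
        # d/dmu log N(price; mu, sigma^2) = (price - mu) / sigma^2
        score = (price - mu) / sigma ** 2
        mu += lr * (revenue - baseline) * score
        # Update the baseline after using it, so it does not depend on this sample.
        baseline += 0.01 * (revenue - baseline)
    return mu


learned_mu = train_price_policy()
```

Under the assumed demand curve, expected revenue is `price * (1 - price / 10)`, which peaks at a price of 5, so `learned_mu` should drift from its starting value of 1 toward that region. A policy closer to the setting described in the abstract would map observable features (current demand, state of the seller's resources) to a price rather than learning a single number.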

Bibliographic details

Source: Future generation computer systems, 2008, issue 7, pp. 687-693 (7 pages)
Author: David Vengerov
Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords: reinforcement learning; dynamic pricing; grid; policy gradient

