Dynamic Packaging In E-retailing With Stochastic Demand Over Finite Horizons: A Q-learning Approach

Yan Cheng

Abstract

This paper investigates how an intelligent agent may use Q-learning, a simulation-based stochastic technique, to make optimal dynamic packaging decisions in an e-retailing setting. When the practical application of dynamic packaging involves a large number of products, the standard Q-learning approach encounters two major problems caused by the excessively large state space. First, learning the Q-values in tabular form may be infeasible because of the amount of memory needed to store the table. Second, rewards in the state space may be so sparse that random exploration discovers them only extremely slowly. This paper first describes the state-dependent and event-driven nature of the dynamic packaging problem with a Markov decision process model, then proposes a state generalization approach based on a distortion measure, and finally puts forward a heuristic-based exploration/exploitation policy to improve the convergence of Q-learning. We validate our approach in a simulated test.
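
For orientation, the baseline that the abstract calls standard (tabular) Q-learning can be sketched on a toy finite-horizon selling problem. This is only an illustrative sketch and not the paper's method: the two offers, their prices and purchase probabilities, the horizon, and the names `OFFERS`, `step`, and `q_learning` are all hypothetical, and plain epsilon-greedy exploration stands in for the heuristic policy the paper proposes. The paper's state generalization step is not implemented here.

```python
import random
from collections import defaultdict

# Hypothetical toy offers: (units in the package, price, purchase probability).
OFFERS = [(1, 10.0, 0.5), (2, 16.0, 0.4)]


def step(inventory, action, rng):
    """Simulate one customer arrival; return (reward, new_inventory)."""
    units, price, prob = OFFERS[action]
    if inventory >= units and rng.random() < prob:
        return price, inventory - units
    return 0.0, inventory


def q_learning(horizon=5, start_inventory=4, episodes=5000,
               alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning with the period index as part of the state."""
    rng = random.Random(seed)
    q = defaultdict(float)  # keyed by (period, inventory, action)
    actions = range(len(OFFERS))
    for _ in range(episodes):
        inventory = start_inventory
        for t in range(horizon):
            # Epsilon-greedy: explore at random, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(len(OFFERS))
            else:
                a = max(actions, key=lambda x: q[(t, inventory, x)])
            reward, nxt = step(inventory, a, rng)
            # Undiscounted finite horizon: no value after the last period.
            if t == horizon - 1:
                future = 0.0
            else:
                future = max(q[(t + 1, nxt, x)] for x in actions)
            key = (t, inventory, a)
            q[key] += alpha * (reward + future - q[key])
            inventory = nxt
    return q
```

Even in this toy the table is indexed by period, inventory level, and action, so its size grows multiplicatively with every extra product or inventory dimension. That growth is the memory problem the abstract describes for realistic product counts.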

Bibliographic details

Source: Expert Systems with Applications, 2009, No. 1, pp. 472-480 (9 pages)
Author: Yan Cheng
Original format: PDF
Language: English
Keywords: e-retailing; dynamic packaging; Q-learning

Similar literature

1. Finite-horizon optimal control of discrete-time linear systems with completely unknown dynamics using Q-learning [J]. Zhao Jingang, Zhang Chi. Journal of Industrial and Management Optimization, 2021, No. 3.
2. An algebraic expression of finite horizon optimal control algorithm for stochastic logical dynamical systems [J]. Wu Yuhu, Shen Tielong. Systems and Control Letters, 2015.
3. New combined method for solving the single level capacitated production planning model with set up cost, finite horizon and discrete stochastic demand [J]. Seyed Saeid Hashemin, Elham Mohammadi. International Journal of Economics, Finance and Management Sciences, 2014, No. 3.
4. Real Time Demand Learning-Based Q-learning Approach for Dynamic Pricing in E-retailing Setting [C]. Yan Cheng. Information Engineering and Electronic Commerce, 2009 (IEEC '09), 2009.
5. A finite-horizon inventory problem with deterministic, dynamic demand and stochastic supply disruptions [D]. Pornsing, Choosak. 2010.
6. Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning [O]. Shota Ohnishi, Eiji Uchibe, Yotaro Yamaguchi. 2019.
7. Finite-horizon optimal control of discrete-time linear systems with completely unknown dynamics using Q-learning [O]. Jingang Zhao, Chi Zhang. 2017.
