首页> 外文期刊>Queueing systems >Reward maximization in general dynamic matching systems
【24h】

Reward maximization in general dynamic matching systems

机译:通用动态匹配系统中的奖励最大化

获取原文
获取原文并翻译 | 示例
           

摘要

We consider a matching system with random arrivals of items of different types. The items wait in queuesone per item typeuntil they are matched. Each matching requires certain quantities of items of different types; after a matching is activated, the associated items leave the system. There exists a finite set of possible matchings, each producing a certain amount of reward. This model has a broad range of important applications, including assemble-to-order systems, Internet advertising, and matching web portals. We propose an optimal matching scheme in the sense that it asymptotically maximizes the long-term average matching reward, while keeping the queues stable. The scheme makes matching decisions in a specially constructed virtual system, which in turn controls decisions in the physical system. The key feature of the virtual system is that, unlike the physical one, it allows the queues to become negative. The matchings in the virtual system are controlled by an extended version of the greedy primal-dual (GPD) algorithm, which we prove to be asymptotically optimalthis in turn implies the asymptotic optimality of the entire scheme. The scheme is real time; at any time, it uses simple rules based on the current state of the virtual and physical queues. It is very robust in that it does not require any knowledge of the item arrival rates and automatically adapts to changing rates. The extended GPD algorithm and its asymptotic optimality apply to a quite general queueing network framework, not limited to matching problems, and therefore are of independent interest.
机译:我们考虑一个匹配系统,其中不同类型的物品随机到达。每种项目类型的项目在队列中等待的时间直到它们被匹配为止。每个匹配项都需要一定数量的不同类型的项目;激活匹配后,关联的项目将离开系统。存在一组有限的可能匹配项,每个匹配项都会产生一定量的奖励。该模型具有广泛的重要应用程序,包括按订单组装系统,Internet广告和匹配的Web门户。在保持队列稳定的同时,渐近最大化长期平均匹配奖励的意义上,我们提出了一种最佳匹配方案。该方案在专门构建的虚拟系统中做出匹配的决策,该虚拟系统又控制物理系统中的决策。虚拟系统的关键功能是,与物理系统不同,它允许队列变为负数。虚拟系统中的匹配由贪婪原始对偶(GPD)算法的扩展版本控制,我们证明它是渐近最优的,这又暗示了整个方案的渐近最优性。该方案是实时的;在任何时候,它都会根据虚拟和物理队列的当前状态使用简单的规则。它非常强大,因为它不需要任何物品到达速度的知识,并且可以自动适应变化的速度。扩展的GPD算法及其渐近最优性适用于相当普遍的排队网络框架,而不仅限于匹配问题,因此具有独立的意义。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号