Journal: Annals of Mathematics and Artificial Intelligence

An optimal bidimensional multi-armed bandit auction for multi-unit procurement



Abstract

We study the problem of a buyer who gains stochastic rewards by procuring, through an auction, multiple units of a service or item from a pool of heterogeneous agents who are strategic on two dimensions, namely cost and capacity. The reward obtained for a single unit from an allocated agent depends on the inherent quality of that agent; this quality is fixed but unknown. Each agent can supply only a limited number of units (the agent's capacity). The per-unit cost and the capacity (the maximum number of units that can be supplied) are private information of each agent. The auctioneer must elicit from the agents their costs as well as their capacities (which makes the mechanism design problem bidimensional) and must also learn the agents' qualities, with a view to maximizing her utility. Motivated by this, we design a bidimensional multi-armed bandit procurement auction that seeks to maximize the expected utility of the auctioneer subject to incentive compatibility and individual rationality, while simultaneously learning the unknown qualities of the agents. We first assume that the qualities are known and propose an optimal, truthful mechanism, 2D-OPT, for the auctioneer to elicit costs and capacities. Next, in order to learn the qualities of the agents as well, we provide sufficient conditions for a learning algorithm to be Bayesian incentive compatible and individually rational. Finally, we design a novel learning mechanism, 2D-UCB, that is stochastic Bayesian incentive compatible and individually rational.
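To make the setting concrete, the following is a minimal toy sketch in Python of the ingredients the abstract names: agents with a per-unit cost, a capacity, and a quality, plus a standard UCB1 index for an unknown quality. It is not the paper's 2D-OPT or 2D-UCB mechanism. It ignores incentives and payments entirely and assumes the reports are truthful. The names `Agent`, `greedy_allocation`, `ucb_index`, and `reward_scale` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    cost: float      # reported per-unit cost (assumed truthful in this toy)
    capacity: int    # reported maximum number of units the agent can supply
    quality: float   # mean reward per unit, or an estimate of it


def greedy_allocation(agents, units_needed, reward_scale=1.0):
    """Fill the demand in decreasing order of per-unit surplus.

    Surplus is reward_scale * quality - cost. Agents with non-positive
    surplus receive nothing, and no agent exceeds its capacity.
    """
    ranked = sorted(
        agents,
        key=lambda a: reward_scale * a.quality - a.cost,
        reverse=True,
    )
    allocation = {a.name: 0 for a in agents}
    remaining = units_needed
    for a in ranked:
        surplus = reward_scale * a.quality - a.cost
        if remaining == 0 or surplus <= 0:
            break  # the list is sorted, so no later agent can help
        take = min(a.capacity, remaining)
        allocation[a.name] = take
        remaining -= take
    return allocation


def ucb_index(mean_reward, pulls, total_pulls):
    """Standard UCB1 index: empirical mean plus an exploration bonus."""
    if pulls == 0:
        return float("inf")  # an untried agent is maximally optimistic
    return mean_reward + math.sqrt(2.0 * math.log(total_pulls) / pulls)


agents = [
    Agent("A", cost=0.2, capacity=2, quality=0.9),
    Agent("B", cost=0.1, capacity=3, quality=0.6),
    Agent("C", cost=0.5, capacity=5, quality=0.4),
]
print(greedy_allocation(agents, units_needed=4))
```

In this toy instance, agent A has the highest surplus but can supply only two units, so the remaining two go to B, and C is excluded because its cost exceeds its expected reward. When qualities are unknown, one simple way to combine the two pieces would be to pass `ucb_index` values as the `quality` field; whether that preserves truthfulness is exactly the question the paper's sufficient conditions address, and this sketch makes no such claim.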
