Learning a Value Analysis Tool For Agent Evaluation


Abstract

Evaluating an agent's performance in a stochastic setting is necessary for agent development, scientific evaluation, and competitions. Traditionally, evaluation is done using Monte Carlo estimation; however, the magnitude of the stochasticity in the domain or the high cost of sampling can often prevent this approach from yielding statistically significant conclusions. Recently, an advantage sum technique has been proposed for constructing unbiased, low-variance estimates of agent performance. The technique requires an expert to define a value function over states of the system, essentially a guess of each state's unknown value. In this work, we propose learning this value function from past interactions between agents in some target population. Our learned value functions have two key advantages: they can be applied in domains where no expert value function is available, and they can result in tuned evaluation for a specific population of agents (e.g., novice versus advanced agents). We demonstrate these two advantages in the domain of poker. We show that we can reduce variance over state-of-the-art estimators for a specific population of limit poker players, as well as construct the first variance-reducing estimators for no-limit poker and multi-player limit poker.
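
The contrast the abstract draws between plain Monte Carlo estimation and a value-function-based estimator can be illustrated with a toy sketch. This is not the paper's estimator or its learned value function; it only shows the general idea of subtracting a zero-mean "luck" term computed from a guessed value function. The game, the card values, `agent_edge`, and every function name below are hypothetical choices made for illustration.

```python
import random
import statistics

# Hypothetical chance outcomes, drawn uniformly at the start of each hand.
CARDS = [-1, 0, 1]


def play_hand(rng, agent_edge=0.1):
    """Simulate one hand of a toy game and return (card, utility).

    The utility is dominated by the chance outcome (10 * card); the agent's
    true skill contributes only a small edge plus unit-variance noise.
    """
    card = rng.choice(CARDS)
    utility = 10.0 * card + agent_edge + rng.gauss(0.0, 1.0)
    return card, utility


def value_guess(card):
    """Assumed value function: a guess at the state's worth after the draw.

    It is deliberately imperfect (the true coefficient is 10, not 8).
    """
    return 8.0 * card


def luck_term(card):
    """Guessed value after the chance event minus its expectation.

    The expectation is taken over the known chance distribution, so this
    term has zero mean regardless of how good the value guess is.
    """
    expected = sum(value_guess(c) for c in CARDS) / len(CARDS)
    return value_guess(card) - expected


def evaluate(num_hands=10000, seed=0):
    """Return per-hand raw utilities and luck-corrected utilities."""
    rng = random.Random(seed)
    raw, corrected = [], []
    for _ in range(num_hands):
        card, utility = play_hand(rng)
        raw.append(utility)
        corrected.append(utility - luck_term(card))
    return raw, corrected


raw, corrected = evaluate()
print("Monte Carlo:    mean=%.3f stdev=%.3f"
      % (statistics.mean(raw), statistics.stdev(raw)))
print("Luck-corrected: mean=%.3f stdev=%.3f"
      % (statistics.mean(corrected), statistics.stdev(corrected)))
```

Because the luck term has zero expectation under the chance distribution, subtracting it leaves the estimate unbiased even when the value guess is wrong. A better guess removes more of the variance, which is the motivation the abstract gives for learning the value function from data.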


Bibliographic details

Source
International Joint Conference on Artificial Intelligence (IJCAI-09), 2009, pp. 1976-1981 (6 pages)
Authors
Martha White; Michael Bowling
Original format
PDF
Chinese Library Classification
Artificial intelligence theory

