PERFORMANCE ANALYSIS OF A 240 THREAD TOURNAMENT LEVEL MCTS GO PROGRAM ON THE INTEL XEON PHI

Abstract

In 2013 Intel introduced the Xeon Phi, a new parallel coprocessor board. The Xeon Phi is a cache-coherent many-core shared memory architecture claiming CPU-like versatility, programmability, high performance, and power efficiency. The first published micro-benchmark studies indicate that many of Intel's claims appear to be true. The current paper is the first study on the Phi of a complex artificial intelligence application. It contains an open source MCTS application for playing tournament quality Go (an oriental board game). We report the first speedup figures for up to 240 parallel threads on a real machine, allowing a direct comparison to previous simulation studies. After a substantial amount of work, we observed that performance scales well up to 32 threads, largely confirming previous simulation results of this Go program, although the performance surprisingly deteriorates between 32 and 240 threads. Furthermore, we report (1) unexpected performance anomalies between the Xeon Phi and Xeon CPU for small problem sizes and small numbers of threads, and (2) that performance is sensitive to scheduling choices. Achieving good performance on the Xeon Phi for complex programs is not straightforward; it requires a deep understanding (1) of search patterns, (2) of scheduling, and (3) of the architecture and its many cores and caches. In practice, the Xeon Phi is less straightforward to program for than originally envisioned by Intel.

Bibliographic details

Source
European Simulation and Modelling Conference, 2014, pp. 88-94 (7 pages)
Authors
S. Ali Mirsoleimani; Aske Plaat; Jos Vermaseren; Jaap van den Herik
Original format: PDF
Keywords
Distributed and Parallel Systems Simulation; Simulation Fidelity and Performance Evaluation; Monte Carlo Methods; Special Architectures; Experimental and Comparative Studies; Memory access patterns; Game playing