International Conference on Computational Science

Fully-Asynchronous Cache-Efficient Simulation of Detailed Neural Networks



Abstract

Modern asynchronous runtime systems allow the re-thinking of large-scale scientific applications. Using a simulator of morphologically detailed neural networks as an example, we show how detaching from the commonly used bulk-synchronous parallel (BSP) execution enables increased prefetching, better cache locality, and an overlap of computation and communication, consequently leading to a lower time to solution. Our strategy removes the collective synchronization of the ODEs' coupling information and takes advantage of the pairwise time dependency between equations, leading to a fully-asynchronous, exhaustive yet non-speculative stepping model. Combined with fully linear data structures, communication reduction at the compute-node level, and an earliest-equation-steps-first scheduler, we achieve cache-level acceleration that reduces communication and time to solution by maximizing the number of timesteps taken per neuron at each iteration. Our methods were implemented on the core kernel of the NEURON scientific application. Asynchronicity and a distributed memory space are provided by the HPX runtime system for the ParalleX execution model. Benchmark results demonstrate a superlinear speed-up that reduces runtime compared to the bulk-synchronous execution, yielding a speed-up between 25% and 65% across different compute architectures, and on the order of 15% to 40% for distributed executions.
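The scheduling idea described in the abstract, namely letting each neuron take every timestep currently permitted by its pairwise time dependencies, with the earliest (least-advanced) equation stepping first, can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the NEURON/HPX implementation: the function `async_step_schedule`, the integer step counts, and the `deps`/`delay_steps` parameters are hypothetical, and real synaptic delays and communication are omitted.

```python
import heapq

def async_step_schedule(deps, delay_steps, total_steps):
    """Sketch of earliest-equation-steps-first asynchronous stepping.

    deps[n] lists the presynaptic neurons whose state neuron n consumes;
    delay_steps (>= 1) is the minimum synaptic delay in timesteps, which
    bounds how far n may run ahead of its dependencies without speculation.
    Returns the order in which neurons advanced, as (neuron, from, to).
    """
    steps = {n: 0 for n in deps}      # local timestep count per neuron
    heap = [(0, n) for n in deps]     # priority queue keyed by local time
    heapq.heapify(heap)
    trace = []
    while heap:
        s, n = heapq.heappop(heap)
        if s != steps[n]:             # stale heap entry; skip it
            continue
        # Pairwise limit: a neuron may not step past its earliest
        # dependency's local time plus the synaptic delay.
        limit = min((steps[p] for p in deps[n]), default=total_steps)
        target = min(limit + delay_steps, total_steps)
        trace.append((n, steps[n], target))   # take all allowed steps at once
        steps[n] = target
        if target < total_steps:
            heapq.heappush(heap, (target, n))
    return trace
```

Because the least-advanced neuron is always scheduled first and `delay_steps >= 1`, the popped neuron can always make progress, so the loop terminates without speculative stepping; maximizing the steps taken per pop is what amortizes cache loads across many timesteps.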
