Journal: International journal of management science and engineering management

Bayes and big data: the consensus Monte Carlo algorithm

Abstract

A useful definition of 'big data' is data that is too big to process comfortably on a single machine because of processor, memory, or disk bottlenecks. Graphics processing units can alleviate the processor bottleneck, but memory or disk bottlenecks can only be eliminated by splitting data across multiple machines. Communication between large numbers of machines is expensive (regardless of the amount of data being communicated), so there is a need for algorithms that perform distributed approximate Bayesian analyses with minimal communication. Consensus Monte Carlo operates by running a separate Monte Carlo algorithm on each machine, and then averaging individual Monte Carlo draws across machines. Depending on the model, the resulting draws can be nearly indistinguishable from the draws that would have been obtained by running a single-machine algorithm for a very long time. Examples of consensus Monte Carlo are shown for simple models where single-machine solutions are available, for large single-layer hierarchical models, and for Bayesian additive regression trees (BART).
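The averaging step described in the abstract can be illustrated with a minimal sketch on a toy problem: the posterior of a Gaussian mean with known variance, where the full-data answer is available in closed form for comparison. Several things here are assumptions made for illustration and are not stated in the abstract: the function names `shard_posterior_draws` and `consensus_average` are hypothetical, the shards are simulated in a loop on one machine rather than on separate workers, the prior is split across shards by inflating its variance, and the average is weighted by each shard's estimated precision.

```python
import numpy as np

def shard_posterior_draws(y, sigma2, prior_mean, prior_var, n_shards, n_draws, rng):
    """Draw from one shard's posterior for a Gaussian mean with known variance.

    Assumption: the prior variance is inflated by n_shards so that the product
    of the shard posteriors matches the full-data posterior.
    """
    frac_prior_var = prior_var * n_shards
    post_prec = 1.0 / frac_prior_var + len(y) / sigma2
    post_mean = (prior_mean / frac_prior_var + y.sum() / sigma2) / post_prec
    return rng.normal(post_mean, np.sqrt(1.0 / post_prec), size=n_draws)

def consensus_average(draws_by_shard):
    """Average draws across shards, draw by draw, weighting by shard precision."""
    draws = np.asarray(draws_by_shard)         # shape (n_shards, n_draws)
    weights = 1.0 / draws.var(axis=1, ddof=1)  # one estimated precision per shard
    return (weights[:, None] * draws).sum(axis=0) / weights.sum()

rng = np.random.default_rng(0)
sigma2, prior_mean, prior_var = 4.0, 0.0, 100.0

# Simulated data, split into 10 shards standing in for 10 machines.
y = rng.normal(1.5, np.sqrt(sigma2), size=10_000)
shards = np.array_split(y, 10)

# Each "machine" samples independently; only the draws are communicated.
draws_by_shard = [
    shard_posterior_draws(s, sigma2, prior_mean, prior_var, len(shards), 5_000, rng)
    for s in shards
]
consensus = consensus_average(draws_by_shard)

# Closed-form full-data posterior, used only as a reference.
full_prec = 1.0 / prior_var + len(y) / sigma2
full_mean = (prior_mean / prior_var + y.sum() / sigma2) / full_prec
full_sd = np.sqrt(1.0 / full_prec)

print("consensus mean/sd:", consensus.mean(), consensus.std(ddof=1))
print("full-data mean/sd:", full_mean, full_sd)
```

In this Gaussian case the precision-weighted average reproduces the full-data posterior up to Monte Carlo error, which is the "simple models where single-machine solutions are available" setting the abstract mentions. For non-Gaussian posteriors the same averaging would be only approximate.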