Conference: Database Systems for Advanced Applications

Bayesian Network Structure Learning from Big Data: A Reservoir Sampling Based Ensemble Method

Abstract

Bayesian network (BN) learning from big datasets is potentially more valuable than learning from conventional small datasets, as big data contain more comprehensive probability distributions and richer causal relationships. However, learning BNs from big datasets incurs a high computational cost and can easily fail, especially when the learning task is performed on a conventional computation platform. This paper addresses the issue of BN structure learning from a big dataset on a conventional computation platform, and proposes a reservoir-sampling-based ensemble method (RSEM). In RSEM, a greedy algorithm is used to determine an appropriate size for the sub-datasets to be extracted from the big dataset. A fast reservoir sampling method is then adopted to extract the sub-datasets efficiently in one pass. Lastly, a weighted adjacency-matrix-based ensemble method is employed to produce the final BN structure. Experimental results on both synthetic and real-world big datasets show that RSEM can perform BN structure learning accurately and efficiently.
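The abstract names two building blocks: one-pass reservoir sampling and an ensemble over weighted adjacency matrices. The sketch below illustrates both under stated assumptions and is not the authors' implementation. The sampler is the classic Algorithm R, which may differ from the "fast" variant used in RSEM. The function names `reservoir_sample` and `weighted_vote`, the per-structure weights, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import random
from typing import Iterable, List, Sequence, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, rng: random.Random) -> List[T]:
    """Classic Algorithm R: uniform sample of k items in a single pass.

    Assumption: the paper's "fast" sampler may be a different variant.
    """
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i survives with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def weighted_vote(
    adjacency_matrices: Sequence[Sequence[Sequence[int]]],
    weights: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[List[int]]:
    """Keep an edge when its weighted share of votes exceeds the threshold."""
    n = len(adjacency_matrices[0])
    total = sum(weights)
    score = [[0.0] * n for _ in range(n)]
    for matrix, weight in zip(adjacency_matrices, weights):
        for r in range(n):
            for c in range(n):
                score[r][c] += weight * matrix[r][c]
    return [
        [1 if score[r][c] / total > threshold else 0 for c in range(n)]
        for r in range(n)
    ]


if __name__ == "__main__":
    # Draw a sub-dataset of 10 row indices from a stream of 1000 rows.
    rows = reservoir_sample(range(1000), 10, random.Random(0))
    print(len(rows))

    # Three candidate structures over two variables, equally weighted.
    learned = [
        [[0, 1], [0, 0]],
        [[0, 1], [0, 0]],
        [[0, 0], [1, 0]],
    ]
    print(weighted_vote(learned, [1.0, 1.0, 1.0]))
```

Note that a plain thresholded vote does not guarantee an acyclic result, so a real ensemble step would need some additional check or repair to ensure the final structure is a valid BN.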
