
Large-scale empirical validation of Bayesian Network structure learning algorithms with noisy data


Abstract

Numerous Bayesian Network (BN) structure learning algorithms have been proposed in the literature over the past few decades. Each publication makes an empirical or theoretical case for the algorithm proposed in that publication, and results across studies are often inconsistent in their claims about which algorithm is 'best'. This is partly because there is no agreed evaluation approach to determine their effectiveness. Moreover, each algorithm is based on a set of assumptions, such as complete data and causal sufficiency, and tends to be evaluated with data that conforms to these assumptions, however unrealistic these assumptions may be in the real world. As a result, it is widely accepted that synthetic performance overestimates real performance, although to what degree this may happen remains unknown. This paper investigates the performance of 15 state-of-the-art, well-established, or recent promising structure learning algorithms. We propose a methodology that applies the algorithms to data that incorporates synthetic noise, in an effort to better understand the performance of structure learning algorithms when applied to real data. Each algorithm is tested over multiple case studies, sample sizes, and types of noise, and is assessed with multiple evaluation criteria. This work involved learning approximately 10,000 graphs with a total structure learning runtime of seven months. In investigating the impact of data noise, we provide the first large-scale empirical comparison of BN structure learning algorithms under different assumptions of data noise. The results suggest that traditional synthetic performance may overestimate real-world performance by anywhere between 10% and more than 50%. They also show that while score-based learning is generally superior to constraint-based learning, a higher fitting score does not necessarily imply a more accurate causal graph. The comparisons extend to other outcomes of interest, such as runtime, reliability, and resilience to noise, assessed over both small and large networks, and with both limited and big data. To facilitate comparisons with future studies, we have made all data, raw results, graphs and BN models freely available online. © 2021 The Author(s). Published by Elsevier Inc.
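The abstract does not spell out how noise is injected or how graphs are scored, so the following Python sketch is only an illustration of the general idea, not the authors' procedure. It assumes one hypothetical kind of noise (randomly replacing cell values in discrete data, a crude stand-in for measurement error), one hypothetical metric (F1 over directed edges), and one possible definition of "overestimation" (the gap between clean-data and noisy-data scores, relative to the noisy-data score). All function names are made up for this example.

```python
import random


def inject_measurement_error(rows, states, rate, seed=0):
    """Return a copy of `rows` in which each cell is replaced, with
    probability `rate`, by a different state of the same column.

    Assumption: this is a simplified noise model chosen for illustration.
    `states[col]` lists the admissible values of column `col`.
    """
    rng = random.Random(seed)
    noisy = []
    for row in rows:
        new_row = []
        for col, value in enumerate(row):
            if rng.random() < rate:
                alternatives = [s for s in states[col] if s != value]
                if alternatives:
                    value = rng.choice(alternatives)
            new_row.append(value)
        noisy.append(new_row)
    return noisy


def edge_f1(true_edges, learned_edges):
    """F1 score over directed edges, each given as a (parent, child) tuple.

    Assumption: a reversed edge counts as both a miss and a false positive.
    """
    true_set, learned_set = set(true_edges), set(learned_edges)
    true_positives = len(true_set & learned_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(learned_set)
    recall = true_positives / len(true_set)
    return 2 * precision * recall / (precision + recall)


def relative_overestimate(clean_score, noisy_score):
    """How much the clean-data score exceeds the noisy-data score,
    as a fraction of the noisy-data score (hypothetical definition)."""
    return (clean_score - noisy_score) / noisy_score


# Toy usage: a two-edge ground-truth graph and a learned graph with one
# edge reversed, plus a tiny binary dataset.
true_graph = [("A", "B"), ("B", "C")]
learned_graph = [("A", "B"), ("C", "B")]
data = [[0, 1, 0], [1, 1, 0], [0, 0, 1]]
column_states = [[0, 1], [0, 1], [0, 1]]

noisy_data = inject_measurement_error(data, column_states, rate=0.1, seed=42)
print(edge_f1(true_graph, learned_graph))
print(relative_overestimate(clean_score=0.75, noisy_score=0.5))
```

In a real comparison, a structure learning algorithm would be run once on the clean data and once on the noisy data, and the two learned graphs would each be scored against the ground-truth graph before the gap is computed.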

Bibliographic details

  • Source
    International Journal of Approximate Reasoning | 2021, Issue 4 | pp. 151-188 | 38 pages
  • Author affiliations

    Queen Mary Univ London QMUL Sch EECS Bayesian Artificial Intelligence Res Lab Risk & Informat Management RIM Res Grp London E1 4NS England|Alan Turing Inst British Lib 96 Euston Rd London NW1 2DB England;

    Queen Mary Univ London QMUL Sch EECS Bayesian Artificial Intelligence Res Lab Risk & Informat Management RIM Res Grp London E1 4NS England;

    Queen Mary Univ London QMUL Sch EECS Bayesian Artificial Intelligence Res Lab Risk & Informat Management RIM Res Grp London E1 4NS England;

    Queen Mary Univ London QMUL Sch EECS Bayesian Artificial Intelligence Res Lab Risk & Informat Management RIM Res Grp London E1 4NS England;

    Queen Mary Univ London QMUL Sch EECS Bayesian Artificial Intelligence Res Lab Risk & Informat Management RIM Res Grp London E1 4NS England;

  • Indexed in: Science Citation Index (SCI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    Ancestral graphs; Causal discovery; Causal insufficiency; Directed acyclic graphs; Measurement error; Probabilistic graphical models;

