Large-scale empirical validation of Bayesian Network structure learning algorithms with noisy data

Constantinou Anthony C.; Liu Yang; Chobtham Kiattikun; Guo Zhigao; Kitson Neville K.

Abstract

Numerous Bayesian Network (BN) structure learning algorithms have been proposed in the literature over the past few decades. Each publication makes an empirical or theoretical case for the algorithm proposed in that publication, and results across studies are often inconsistent in their claims about which algorithm is 'best'. This is partly because there is no agreed evaluation approach to determine their effectiveness. Moreover, each algorithm is based on a set of assumptions, such as complete data and causal sufficiency, and tends to be evaluated with data that conforms to these assumptions, however unrealistic these assumptions may be in the real world. As a result, it is widely accepted that synthetic performance overestimates real performance, although to what degree this may happen remains unknown. This paper investigates the performance of 15 state-of-the-art, well-established, or recent promising structure learning algorithms. We propose a methodology that applies the algorithms to data that incorporates synthetic noise, in an effort to better understand the performance of structure learning algorithms when applied to real data. Each algorithm is tested over multiple case studies, sample sizes, and types of noise, and assessed with multiple evaluation criteria. This work involved learning approximately 10,000 graphs with a total structure learning runtime of seven months. In investigating the impact of data noise, we provide the first large-scale empirical comparison of BN structure learning algorithms under different assumptions of data noise. The results suggest that traditional synthetic performance may overestimate real-world performance by anywhere between 10% and more than 50%. They also show that while score-based learning is generally superior to constraint-based learning, a higher fitting score does not necessarily imply a more accurate causal graph. The comparisons extend to other outcomes of interest, such as runtime, reliability, and resilience to noise, assessed over both small and large networks, and with both limited and big data. To facilitate comparisons with future studies, we have made all data, raw results, graphs and BN models freely available online. © 2021 The Author(s). Published by Elsevier Inc.
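As a rough illustration of what "data that incorporates synthetic noise" could look like, the sketch below corrupts a clean discrete dataset in three ways. The three noise types (missing values, incorrect values, and dropped variables) are assumptions inferred from the abstract's mention of complete data and causal sufficiency and from the keywords; the function names, noise rates, and toy variables are hypothetical and are not taken from the paper.

```python
import random

def add_missing(rows, rate, rng):
    """Blank out each cell with probability `rate` (violates complete data)."""
    return [[None if rng.random() < rate else v for v in row] for row in rows]

def add_incorrect(rows, rate, states, rng):
    """With probability `rate`, replace a cell with a different state of the
    same variable (a simple stand-in for measurement error)."""
    noisy = []
    for row in rows:
        new_row = []
        for col, value in enumerate(row):
            if rng.random() < rate:
                alternatives = [s for s in states[col] if s != value]
                if alternatives:
                    value = rng.choice(alternatives)
            new_row.append(value)
        noisy.append(new_row)
    return noisy

def drop_latent(rows, latent_cols):
    """Remove whole columns so those variables become unobserved
    (violates causal sufficiency)."""
    hidden = set(latent_cols)
    return [[v for col, v in enumerate(row) if col not in hidden] for row in rows]

# Toy clean dataset: three hypothetical discrete variables, 1,000 samples.
rng = random.Random(0)
states = [["yes", "no"], ["low", "mid", "high"], ["a", "b"]]
clean = [[rng.choice(s) for s in states] for _ in range(1000)]

# Hypothetical 5% noise rates; the paper's actual settings are not given here.
missing = add_missing(clean, 0.05, rng)
incorrect = add_incorrect(clean, 0.05, states, rng)
latent = drop_latent(clean, [1])
```

Each noisy copy would then be passed to the same structure learning algorithm as the clean data, so that the drop in graph accuracy can be attributed to the injected noise.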


Bibliographic details

Source
International Journal of Approximate Reasoning | 2021, Issue 4 | pp. 151-188 | 38 pages
Authors
Constantinou Anthony C.; Liu Yang; Chobtham Kiattikun; Guo Zhigao; Kitson Neville K.
Author affiliations

Queen Mary Univ London QMUL Sch EECS Bayesian Artificial Intelligence Res Lab Risk & Informat Management RIM Res Grp London E1 4NS England|Alan Turing Inst British Lib 96 Euston Rd London NW1 2DB England;

Queen Mary Univ London QMUL Sch EECS Bayesian Artificial Intelligence Res Lab Risk & Informat Management RIM Res Grp London E1 4NS England;

Queen Mary Univ London QMUL Sch EECS Bayesian Artificial Intelligence Res Lab Risk & Informat Management RIM Res Grp London E1 4NS England;

Queen Mary Univ London QMUL Sch EECS Bayesian Artificial Intelligence Res Lab Risk & Informat Management RIM Res Grp London E1 4NS England;

Queen Mary Univ London QMUL Sch EECS Bayesian Artificial Intelligence Res Lab Risk & Informat Management RIM Res Grp London E1 4NS England;

Indexed in: Science Citation Index (SCI), USA
Original format: PDF
Language: English
Keywords
Ancestral graphs; Causal discovery; Causal insufficiency; Directed acyclic graphs; Measurement error; Probabilistic graphical models;


