Minimum training sample size requirements for achieving high prediction accuracy with the BN model: A case study regarding seismic liquefaction


Abstract

The complexity of a Bayesian network (BN) model and the number of training samples used have significant impacts on the prediction accuracy of the model regarding seismic liquefaction. The training sample size required to ensure that a BN model has high generalization ability is a critical issue in parameter learning. To address this issue, this study analyses the relationship between the predictive performance of the BN model and the complexity of the model, the training sample size, and the average discrete intervals. Taking seismic liquefaction prediction as an example, 4536 statistical experiments are designed to investigate the training and testing performances of 21 different BN models under nine different training sample size ratios (5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, and 80% of all data), with testing samples drawn from 20% of all data. The results reveal that the learning performance of a BN is not sensitive to the training sample size but is related to the complexity of the model. The larger the sample size is, the stronger the generalization ability of the model. The minimum training sample requirements are related to the maximum in-degree and the average discrete intervals, not to the numbers of nodes and edges or the maximum out-degree of the BN structure. In addition, a modified structural entropy can characterize the complexity of a BN structure better than the existing structural entropy, but it relates less closely to the minimum training sample requirements than the maximum in-degree does. To quickly determine the minimum training sample size a BN model requires to reach a predictive accuracy of 80%, a fitting function that considers the effects of the maximum in-degree and the average discrete intervals is presented, and its effectiveness is validated by two examples.
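
One way to see why the maximum in-degree and the number of discrete intervals could matter more than node or edge counts: a node's conditional probability table has one entry for every combination of its own states and its parents' states, so its size multiplies with each added parent. The sketch below computes these structural measures for a small network. It is only an illustration and does not reproduce the paper's fitting function, which the abstract does not state; the network, the node names, the interval counts, and the function name `structure_measures` are all hypothetical.

```python
from collections import defaultdict
from math import prod


def structure_measures(edges, intervals):
    """Summarise a discrete BN structure.

    edges     -- list of (parent, child) pairs describing the DAG
    intervals -- dict mapping each node to its number of discrete intervals
    """
    parents = defaultdict(list)
    children = defaultdict(list)
    for parent, child in edges:
        parents[child].append(parent)
        children[parent].append(child)

    # Degree measures named in the abstract.
    max_in = max(len(parents[n]) for n in intervals)
    max_out = max(len(children[n]) for n in intervals)

    # CPT entries of a node = its own intervals times the product of its
    # parents' intervals (prod of an empty sequence is 1 for root nodes).
    largest_cpt = max(
        intervals[n] * prod(intervals[p] for p in parents[n])
        for n in intervals
    )

    avg_intervals = sum(intervals.values()) / len(intervals)
    return {
        "max_in_degree": max_in,
        "max_out_degree": max_out,
        "largest_cpt": largest_cpt,
        "avg_intervals": avg_intervals,
    }


# Hypothetical five-node liquefaction network (not taken from the paper).
edges = [
    ("magnitude", "pga"),
    ("pga", "liquefaction"),
    ("soil_type", "liquefaction"),
    ("groundwater", "liquefaction"),
]
intervals = {
    "magnitude": 4,
    "pga": 4,
    "soil_type": 3,
    "groundwater": 3,
    "liquefaction": 2,
}

measures = structure_measures(edges, intervals)
print(measures)
```

In this hypothetical network the `liquefaction` node has three parents, so its table alone has 2 × 4 × 3 × 3 = 72 entries, and every one of them has to be estimated from training data. Adding nodes or edges elsewhere in the graph would leave that largest table unchanged.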

Bibliographic record

  • Source
    Expert systems with applications | 2021, Issue 12 | 115702.1-115702.13 | 13 pages
  • Author affiliations

    China Three Gorges Univ Coll Civil Engn & Architecture Yichang 443002 Hubei Peoples R China|China Three Gorges Univ Key Lab Geol Hazards Three Gorges Reservoir Area Yichang 443002 Hubei Peoples R China;

    China Three Gorges Univ Med Coll Yichang 443002 Hubei Peoples R China;

    China Three Gorges Univ Coll Civil Engn & Architecture Yichang 443002 Hubei Peoples R China;

    China Three Gorges Univ Coll Civil Engn & Architecture Yichang 443002 Hubei Peoples R China;

  • Indexing information
  • Original format: PDF
  • Language of text: English (eng)
  • Chinese Library Classification
  • Keywords

    Seismic liquefaction; Bayesian network; Sample size; Complexity; Predictive performance;
