Reliability Engineering & System Safety

Cross validation for the classical model of structured expert judgment



Abstract

We update the 2008 TU Delft structured expert judgment database with data from 33 professionally contracted Classical Model studies conducted between 2006 and March 2015 to evaluate its performance relative to other expert aggregation models. We briefly review alternative mathematical aggregation schemes, including harmonic weighting, before focusing on linear pooling of expert judgments with equal weights and performance-based weights. Performance weighting outperforms equal weighting in all but 1 of the 33 studies in-sample. True out-of-sample validation is rarely possible for Classical Model studies, and cross validation techniques that split calibration questions into a training and test set are used instead. Performance weighting incurs an "out-of-sample penalty" and its statistical accuracy out-of-sample is lower than that of equal weighting. However, as a function of training set size, the statistical accuracy of performance-based combinations reaches 75% of the equal weight value when the training set includes 80% of calibration variables. At this point the training set is sufficiently powerful to resolve differences in individual expert performance. The information of performance-based combinations is double that of equal weighting when the training set is at least 50% of the set of calibration variables. Previous out-of-sample validation work used a Total Out-of-Sample Validity Index based on all splits of the calibration questions into training and test subsets, which is expensive to compute and includes small training sets of dubious value. As an alternative, we propose an Out-of-Sample Validity Index based on averaging the product of statistical accuracy and information over all training sets sized at 80% of the calibration set. 
Performance weighting outperforms equal weighting on this Out-of-Sample Validity Index in 26 of the 33 post-2006 studies; the probability of 26 or more successes on 33 trials if there were no difference between performance weighting and equal weighting is 0.001.
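The proposed Out-of-Sample Validity Index averages the product of statistical accuracy and information over every training set containing 80% of the calibration questions. A minimal sketch of that enumeration is below; the two scoring functions are illustrative stubs (hypothetical names, constant toy values), not the Classical Model's actual chi-square-based calibration score or Shannon relative information:

```python
from itertools import combinations
from math import comb

def statistical_accuracy(train, test):
    # Stub: the Classical Model uses a chi-square-based p-value for the
    # combined expert's performance on the test questions. Toy constant here.
    return 0.5

def information(train, test):
    # Stub: relative information of the combination w.r.t. a background
    # measure in the Classical Model. Toy constant here.
    return 1.0

def osvi(calibration_questions, frac=0.8):
    """Average statistical accuracy x information over all training sets
    holding `frac` of the calibration questions (test set = remainder)."""
    n = len(calibration_questions)
    k = round(frac * n)
    total, count = 0.0, 0
    for train in combinations(calibration_questions, k):
        test = [q for q in calibration_questions if q not in train]
        total += statistical_accuracy(train, test) * information(train, test)
        count += 1
    assert count == comb(n, k)  # one term per 80% split
    return total / count
```

With 10 calibration questions this enumerates the C(10, 8) = 45 training sets of size 8, which is far cheaper than the Total Out-of-Sample Validity Index's sum over all possible splits.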
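The quoted p-value can be checked directly: under the null hypothesis of no difference, each study is a fair coin flip, so the chance of 26 or more successes in 33 trials is a binomial tail probability:

```python
from math import comb

# P(X >= 26) for X ~ Binomial(n=33, p=1/2): the probability of 26 or more
# successes in 33 trials if performance and equal weighting were equivalent.
p = sum(comb(33, k) for k in range(26, 34)) / 2**33
print(round(p, 3))  # 0.001, matching the figure quoted in the abstract
```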