Conference on Neural Information Processing Systems

Model Similarity Mitigates Test Set Overuse



Abstract

Excessive reuse of test data has become commonplace in today's machine learning workflows. Popular benchmarks, competitions, and industrial-scale tuning, among other applications, all involve test data reuse beyond the guidance of statistical confidence bounds. Nonetheless, recent replication studies give evidence that popular benchmarks continue to support progress despite years of extensive reuse. We proffer a new explanation for the apparent longevity of test data: many proposed models are similar in their predictions, and we prove that this similarity mitigates overfitting. Specifically, we show empirically that models proposed for the ImageNet ILSVRC benchmark agree in their predictions well beyond what we can conclude from their accuracy levels alone. Likewise, models created by large-scale hyperparameter search enjoy high levels of similarity. Motivated by these empirical observations, we give a non-asymptotic generalization bound that takes similarity into account, leading to meaningful confidence bounds in practical settings.
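To make the agreement measurement concrete, here is a minimal Python sketch using synthetic predictions. The two "models", their accuracies, and the shared-error structure are illustrative assumptions only; the paper's experiments measure agreement between real ILSVRC submissions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 10_000, 1000                    # test set size, number of classes
y = rng.integers(0, k, n)              # hypothetical ground-truth labels

# Two synthetic models that share failure modes: both answer correctly
# on "easy" examples and tend to give the same wrong label on a shared
# "hard" subset. (Purely illustrative, not the paper's actual models.)
hard = rng.random(n) < 0.25
shared_wrong = (y + rng.integers(1, k, n)) % k   # a common wrong label != y
pred_a = np.where(hard & (rng.random(n) < 0.8), shared_wrong, y)
pred_b = np.where(hard & (rng.random(n) < 0.8), shared_wrong, y)

acc_a = (pred_a == y).mean()
acc_b = (pred_b == y).mean()
observed = (pred_a == pred_b).mean()

# If the models erred independently, agreement would be roughly
# P(both correct) = acc_a * acc_b, since two independent wrong answers
# rarely coincide across 1000 classes. Observed agreement far above
# this baseline is the kind of similarity the paper documents.
baseline = acc_a * acc_b
print(f"acc A = {acc_a:.3f}, acc B = {acc_b:.3f}")
print(f"agreement: observed = {observed:.3f}, independence baseline = {baseline:.3f}")
```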
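For context on the final claim, the standard non-asymptotic baseline that a similarity-aware bound improves on is Hoeffding's inequality combined with a union bound over all k models evaluated against the same test set. The sketch below computes that textbook baseline at ILSVRC scale; the paper's own similarity-dependent bound is not reproduced here.

```python
import math

def naive_uniform_deviation(n: int, k: int, delta: float = 0.05) -> float:
    """Hoeffding + union bound over k models: with probability >= 1 - delta,
    every empirical accuracy lies within eps = sqrt(ln(2k/delta) / (2n))
    of its true value."""
    return math.sqrt(math.log(2 * k / delta) / (2 * n))

n = 50_000  # size of the ILSVRC validation set
for k in (10, 1_000, 100_000):
    eps = naive_uniform_deviation(n, k)
    print(f"k = {k:>7,} models -> eps = {eps:.4f}")
```

The union bound treats every evaluated model as a fresh chance to overfit, so its effective k is the total number of models ever tried; the abstract's observation is that highly similar models behave, informally, like far fewer distinct ones, which is what allows tighter and practically meaningful confidence bounds.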
