International Conference on Machine Learning

Adversarial Filters of Dataset Biases



Abstract

Large neural models have demonstrated human-level performance on language and vision benchmarks, while their performance degrades considerably on adversarial or out-of-distribution samples. This raises the question of whether these models have learned to solve a dataset rather than the underlying task by overfitting to spurious dataset biases. We investigate one recently proposed approach, AFLite, which adversarially filters such dataset biases, as a means to mitigate the prevalent overestimation of machine performance. We provide a theoretical understanding for AFLite, by situating it in the generalized framework for optimum bias reduction. We present extensive supporting evidence that AFLite is broadly applicable for reduction of measurable dataset biases, and that models trained on the filtered datasets yield better generalization to out-of-distribution tasks. Finally, filtering results in a large drop in model performance (e.g., from 92% to 62% for SNLI), while human performance still remains high. Our work thus shows that such filtered datasets can pose new research challenges for robust generalization by serving as upgraded benchmarks.
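The abstract describes AFLite as an algorithm that adversarially filters out instances whose labels simple models can predict too reliably. A minimal numpy sketch of this style of filtering, assuming least-squares linear probes as the cheap adversary and illustrative hyperparameter names (not the paper's exact procedure):

```python
import numpy as np

def aflite_sketch(X, y, target_size, num_partitions=32, train_frac=0.8,
                  remove_per_iter=10, seed=0):
    """AFLite-style adversarial filtering (illustrative sketch, not the
    authors' implementation).

    Repeatedly trains cheap linear probes on random partitions of the data
    and removes the instances that held-out probes classify most reliably,
    treating high out-of-sample predictability as a proxy for
    bias-carrying "easy" examples.
    """
    rng = np.random.default_rng(seed)
    Xa = np.column_stack([X, np.ones(len(X))])  # add an intercept column
    keep = np.arange(len(y))                    # indices still in the dataset
    while len(keep) > target_size:
        correct = np.zeros(len(keep))
        counts = np.zeros(len(keep))
        for _ in range(num_partitions):
            perm = rng.permutation(len(keep))
            n_train = int(train_frac * len(keep))
            tr, te = perm[:n_train], perm[n_train:]
            # Least-squares linear probe fit on the random training split.
            w, *_ = np.linalg.lstsq(Xa[keep[tr]], y[keep[tr]], rcond=None)
            pred = (Xa[keep[te]] @ w > 0.5).astype(int)
            correct[te] += pred == y[keep[te]]
            counts[te] += 1
        # Predictability score: fraction of held-out probes that were right.
        score = np.divide(correct, counts,
                          out=np.zeros_like(correct), where=counts > 0)
        # Drop the most predictable instances this round.
        k = min(remove_per_iter, len(keep) - target_size)
        keep = np.delete(keep, np.argsort(score)[-k:])
    return keep
```

On a synthetic dataset where half the instances carry a spuriously label-revealing feature, the sketch preferentially filters out those instances, which is the mechanism behind the benchmark accuracy drop the abstract reports.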
