Conference: International Conference on Medical Image Computing and Computer-Assisted Intervention

Have You Forgotten? A Method to Assess if Machine Learning Models Have Forgotten Data


Abstract

In the era of deep learning, aggregation of data from several sources is a common approach to ensuring data diversity. Consider a scenario where several providers contribute data to a consortium for the joint development of a classification model (hereafter the target model), but one of the providers now decides to leave. This provider requests not only that their data (hereafter the query dataset) be removed from the databases, but also that the model 'forgets' their data. In this paper, we address, for the first time, the challenging question of whether data have been forgotten by a model. We assume knowledge of the query dataset and the distribution of a model's output. We establish statistical methods that compare the target's outputs with outputs of models trained with different datasets. We evaluate our approach on several benchmark datasets (MNIST, CIFAR-10 and SVHN) and on a cardiac pathology diagnosis task using data from the Automated Cardiac Diagnosis Challenge (ACDC). We hope to encourage studies on what information a model retains and inspire extensions to more complex settings.
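The comparison of output distributions described in the abstract can be illustrated with a small sketch. The abstract does not say which statistic or which summary of the outputs is used, so everything below is an assumption made for illustration: the per-sample output entropy as the summary, the two-sample Kolmogorov-Smirnov statistic as the distance, and synthetic Dirichlet draws standing in for real softmax outputs on the query dataset. All function and variable names are hypothetical.

```python
import numpy as np


def output_entropy(probs):
    """Shannon entropy of each row of an (n_samples, n_classes) probability array."""
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=1)


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


# Synthetic stand-ins for softmax outputs on the query dataset (assumption):
# a model that trained on the query data is simulated as more confident
# (peaked Dirichlet), a model that never saw it as less confident (flat Dirichlet).
rng = np.random.default_rng(0)
n_samples, n_classes = 500, 10
seen_probs = rng.dirichlet([0.3] * n_classes, size=n_samples)
unseen_probs = rng.dirichlet([1.0] * n_classes, size=n_samples)
# Hypothetical target model that has NOT forgotten the query data.
target_probs = rng.dirichlet([0.3] * n_classes, size=n_samples)

target_entropy = output_entropy(target_probs)
d_seen = ks_statistic(target_entropy, output_entropy(seen_probs))
d_unseen = ks_statistic(target_entropy, output_entropy(unseen_probs))

# A target whose outputs sit closer to the "seen" reference than to the
# "unseen" reference is, under these assumptions, unlikely to have forgotten.
likely_forgotten = d_unseen < d_seen
print(f"distance to seen reference:   {d_seen:.3f}")
print(f"distance to unseen reference: {d_unseen:.3f}")
print(f"likely forgotten: {likely_forgotten}")
```

In a real setting the three probability arrays would come from evaluating the target model and the reference models on the query dataset, and the decision rule would need a proper significance threshold rather than a bare comparison of two distances.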
