BMC Medical Genomics

Controlling the signal: Practical privacy protection of genomic data sharing through Beacon services

Abstract

Background: Genomic data is increasingly collected by a wide array of organizations. As such, there is a growing demand to make summary information about such collections more widely available. However, over the past decade, a series of investigations have shown that attacks rooted in statistical inference methods can be applied to discern the presence of a known individual’s DNA sequence in the pool of subjects. Recently, it was shown that the Beacon Project of the Global Alliance for Genomics and Health, a web service for querying the presence (or absence) of a specific allele, was vulnerable. The Integrating Data for Analysis, Anonymization, and Sharing (iDASH) Center modeled a track in their third Privacy Protection Challenge on how to mitigate the Beacon vulnerability. We developed the winning solution for this track.

Methods: This paper describes our computational method to optimize the tradeoff between the utility and the privacy of the Beacon service. We generalize the genomic data sharing problem beyond the version introduced in the iDASH Challenge so that it is more representative of real-world scenarios and allows for a more comprehensive evaluation. We then conduct a sensitivity analysis of our method with respect to several state-of-the-art methods using a dataset of 400,000 positions in Chromosome 10 for 500 individuals from Phase 3 of the 1000 Genomes Project. All methods are evaluated for utility, privacy and efficiency.

Results: Our method achieves better performance than all state-of-the-art methods, irrespective of how key factors (e.g., the allele frequency in the population, the size of the pool and utility weights) change from the original parameters of the problem. We further illustrate that it is possible for our method to exhibit subpar performance under special cases of allele query sequences. However, we show that our method can be extended to address this issue when the query sequence is fixed and known a priori to the data custodian, so that they may stage their responses accordingly.

Conclusions: This research shows that it is possible to thwart the attack on Beacon services, without substantially altering the utility of the system, using computational methods. The method we initially developed is limited by the design of the scenario and evaluation protocol for the iDASH Challenge; however, it can be improved by allowing the data custodian to act in a staged manner.
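To make the setting concrete, the following is a minimal Python sketch of a Beacon-style service that answers presence/absence queries and lets a data custodian suppress answers at chosen positions. It is only an illustration of the utility side of the tradeoff described in the abstract, not the authors' method: the `Beacon` class, the `masked` parameter, and the `utility` function are hypothetical names introduced here, and the abstract does not say which responses the winning solution alters or how it selects them.

```python
from typing import Iterable, List, Set


class Beacon:
    """Toy beacon: reports whether any pooled genome carries an
    alternate allele at a queried position."""

    def __init__(self, pool: List[Set[int]], masked: Iterable[int] = ()):
        # pool: one set per individual, holding the positions at which
        # that individual carries an alternate allele.
        self.carried: Set[int] = set().union(*pool)
        # masked: positions the custodian always answers "no" for
        # (a hypothetical defense, assumed for illustration only).
        self.masked: Set[int] = set(masked)

    def truthful(self, position: int) -> bool:
        """The answer an unprotected beacon would give."""
        return position in self.carried

    def query(self, position: int) -> bool:
        """The answer actually released to the querier."""
        if position in self.masked:
            return False
        return self.truthful(position)


def utility(beacon: Beacon, positions: Iterable[int]) -> float:
    """Fraction of queried positions that are answered truthfully."""
    positions = list(positions)
    correct = sum(beacon.query(p) == beacon.truthful(p) for p in positions)
    return correct / len(positions)


if __name__ == "__main__":
    pool = [{1, 2}, {2, 3}]            # two individuals
    beacon = Beacon(pool, masked={3})  # hide position 3
    print(beacon.query(2), beacon.query(3), utility(beacon, [1, 2, 3, 4]))
```

Masking more positions lowers the measured utility. The privacy side of the tradeoff, meaning how much the released answers reveal about whether a known individual is in the pool, would need a separate re-identification measure that this sketch does not model.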
