The American Journal of Human Genetics

Privacy Risks from Genomic Data-Sharing Beacons


Abstract

The human genetics community needs robust protocols that enable secure sharing of genomic data from participants in genetic research. Beacons are web servers that answer allele-presence queries, such as "Do you have a genome that has a specific nucleotide (e.g., A) at a specific genomic position (e.g., position 11,272 on chromosome 1)?", with either "yes" or "no." Here, we show that individuals in a beacon are susceptible to re-identification even if the only data shared include presence or absence information about alleles in a beacon. Specifically, we propose a likelihood-ratio test of whether a given individual is present in a given genetic beacon. Our test is not dependent on allele frequencies and is the most powerful test for a specified false-positive rate. Through simulations, we showed that in a beacon with 1,000 individuals, re-identification is possible with just 5,000 queries. Relatives can also be identified in the beacon. Re-identification is possible even in the presence of sequencing errors and variant-calling differences. In a beacon constructed with 65 European individuals from the 1000 Genomes Project, we demonstrated that it is possible to detect membership in the beacon with just 250 SNPs. With just 1,000 SNP queries, we were able to detect the presence of an individual genome from the Personal Genome Project in an existing beacon. Our results show that beacons can disclose membership and implied phenotypic information about participants and do not protect privacy a priori. We discuss risk mitigation through policies and standards such as not allowing anonymous pings of genetic beacons and requiring minimum beacon sizes.
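
To make the idea of a membership test on yes/no beacon responses concrete, the following is a minimal illustrative sketch in Python. It is not the authors' exact statistic: the per-query probabilities `p_yes_null` and `p_yes_alt`, the uniform carrier probabilities, the beacon size, and all function names are hypothetical choices made only for this example. The sketch queries a toy beacon at positions where a target genome carries an allele and sums a per-query log-likelihood ratio, so that a larger value points towards membership.

```python
import math
import random


def beacon_says_yes(beacon, position):
    """Return True if any genome in the beacon carries the allele at `position`."""
    return any(position in genome for genome in beacon)


def log_likelihood_ratio(responses, p_yes_null, p_yes_alt):
    """log L(target in beacon) - log L(target not in beacon) over yes/no responses."""
    llr = 0.0
    for yes in responses:
        if yes:
            llr += math.log(p_yes_alt / p_yes_null)
        else:
            llr += math.log((1.0 - p_yes_alt) / (1.0 - p_yes_null))
    return llr


def sample_genome(carrier_probs, rng):
    """Draw the set of positions at which one simulated person carries the allele."""
    return {i for i, p in enumerate(carrier_probs) if rng.random() < p}


def simulate(seed=0, n_sites=2000, beacon_size=20, p_yes_null=0.7, p_yes_alt=0.999):
    """Score one beacon member and one outsider against a toy beacon.

    All parameters are assumptions for illustration, not values from the paper.
    """
    rng = random.Random(seed)
    # Hypothetical rare-variant carrier probabilities, uniform on [0, 0.1).
    carrier_probs = [0.1 * rng.random() for _ in range(n_sites)]
    beacon = [sample_genome(carrier_probs, rng) for _ in range(beacon_size)]
    member = beacon[0]
    outsider = sample_genome(carrier_probs, rng)

    def score(target):
        # Query only positions where the target carries the allele.
        responses = [beacon_says_yes(beacon, pos) for pos in sorted(target)]
        return log_likelihood_ratio(responses, p_yes_null, p_yes_alt)

    return score(member), score(outsider)


if __name__ == "__main__":
    member_llr, outsider_llr = simulate()
    print(f"member LLR:   {member_llr:.1f}")
    print(f"outsider LLR: {outsider_llr:.1f}")
```

In this toy setting a member receives "yes" for every query, so each response adds a small positive term, whereas every "no" returned for an outsider adds a large negative term. This asymmetry is the intuition behind why a modest number of queries can separate the two cases; a real analysis would also need to calibrate the threshold to a chosen false-positive rate.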
