Journal: IEEE Transactions on Information Technology in Biomedicine

A Cryptographic Approach to Securely Share and Query Genomic Sequences

Abstract

To support large-scale biomedical research projects, organizations need to share person-specific genomic sequences without violating the privacy of their data subjects. In the past, organizations protected subjects' identities by removing identifiers, such as name and social security number; however, recent investigations illustrate that deidentified genomic data can be "reidentified" to named individuals using simple automated methods. In this paper, we present a novel cryptographic framework that enables organizations to support genomic data mining without disclosing the raw genomic sequences. Organizations contribute encrypted genomic sequence records into a centralized repository, where the administrator can perform queries, such as frequency counts, without decrypting the data. We evaluate the efficiency of our framework with existing databases of single nucleotide polymorphism (SNP) sequences and demonstrate that the time needed to complete count queries is feasible for real-world applications. For example, our experiments indicate that a count query over 40 SNPs in a database of 5000 records can be completed in approximately 30 min with off-the-shelf technology. We further show that approximation strategies can be applied to significantly speed up query execution times with minimal loss in accuracy. The framework can be implemented on top of existing information and network technologies in biomedical environments.
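
The abstract does not name the cryptosystem or the query protocol, so the following is only an illustrative sketch of how a frequency count can be computed over encrypted records. It assumes an additively homomorphic scheme (a toy Paillier construction, chosen here as an assumption) with deliberately tiny, insecure primes, and it covers only a single-SNP count, not the multi-SNP queries evaluated in the paper. All function names are hypothetical.

```python
import math
import secrets

# Toy Paillier cryptosystem (assumed scheme; the abstract does not specify one).
# The primes are far too small for real use and serve only to keep the demo fast.

def keygen(p=1009, q=1013):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                      # standard simple choice of generator
    mu = pow(lam, -1, n)           # valid because g = n + 1 (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    while True:                    # pick random r coprime to n
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def encrypted_count(pub, ciphertexts):
    """Multiply ciphertexts, which adds the underlying plaintexts.

    The repository never sees individual bits; only the final total
    is decrypted by whoever holds the private key.
    """
    n, _ = pub
    n2 = n * n
    total = encrypt(pub, 0)
    for c in ciphertexts:
        total = (total * c) % n2
    return total

pub, priv = keygen()
# Hypothetical data: 1 means the record carries the queried allele at one SNP.
allele_bits = [1, 0, 1, 1, 0]
stored = [encrypt(pub, b) for b in allele_bits]   # what the repository holds
count = decrypt(pub, priv, encrypted_count(pub, stored))
print(count)  # 3
```

In this sketch the party holding the ciphertexts and the party holding the private key are assumed to be separate; how the paper's framework divides those roles is not stated in the abstract.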