Privacy-Preserving Record Linkage with Spark

Abstract

Privacy considerations obligate careful and secure processing of personal data. This is especially true when personal data is linked against databases from other organizations. During such endeavours, privacy-preserving record linkage (PPRL) can be utilized to prevent needless exposure of sensitive information to other organizations. With the increase of personal data that is being gathered and analyzed, scalable PPRL capable of handling massive databases is much desired. In this work, we evaluate Apache Spark as an option to scale PPRL. Not only is it valuable to have a scalable PPRL implementation, but one based on Spark would also be commonly deployable and could take advantage of further development of the ecosystem. Our results show that a PPRL solution based on Spark outperforms alternatives when it comes to handling multiple millions of records, can scale to dozens of nodes, and is on par with regular record linkage implementations in terms of achieved results.

Bibliographic details

Source: IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2019, pp. 440-448 (9 pages)
Authors: Onno Valkering; Adam Belloum
Original format: PDF
Keywords: data privacy; database management systems

Similar literature

1. Blockchain-based Privacy-Preserving Record Linkage: enhancing data privacy in an untrusted environment [J]. Nobrega Thiago, Pires Carlos Eduardo S., Nascimento Dimas Cassimiro. Information Systems. 2021, issue Deca
2. Optimization of the Mainzelliste software for fast privacy-preserving record linkage [J]. Florens Rohde, Martin Franke, Ziad Sehili. Journal of Translational Medicine. 2021, issue 1
3. Incremental clustering techniques for multi-party Privacy-Preserving Record Linkage [J]. Vatsalan Dinusha, Christen Peter, Rahm Erhard. Data & Knowledge Engineering. 2020, issue Jula
4. Privacy-Preserving Record Linkage with Spark [C]. Onno Valkering, Adam Belloum. IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. 2019
5. A Scalable Blocking Framework for Multidatabase Privacy-preserving Record Linkage [D]. Ranbaduge, Thilina. 2018
6. Optimization of the Mainzelliste software for fast privacy-preserving record linkage [O]. Florens Rohde, Martin Franke, Ziad Sehili. 2021
