Journal: IBM Journal of Research and Development

IPV: A system for identifying privacy vulnerabilities in datasets


Abstract

The automated discovery of privacy vulnerabilities in large datasets containing person-specific information is an important first step in the privacy-preserving data publishing process and an area of increased interest for commercial data masking offerings. In this paper, we describe Identification of Privacy Vulnerabilities (IPV), a scalable system for automatically analyzing datasets to expose privacy vulnerabilities. IPV provides data owners with a wealth of methods for analyzing their data by offering state-of-the-art algorithms for 1) computing the direct identifiers and the quasi-identifiers of a dataset, as the single attributes and the minimal combinations of attributes, respectively, that lead to few records; 2) calculating the vulnerability index associated with a dataset, by reporting the cardinality of the smallest group of records that share the same values for each combination of attributes; and 3) reporting the specific records in a dataset that contain a combination of unique or rare values. All of these algorithms operate in a parallel, massively multi-threaded fashion and support various hardware configurations, spanning from commodity machines to multi-CPU multi-core nodes in cluster environments. After describing the system, we discuss the algorithms that are currently supported by IPV and provide some examples of their workings. We conclude this paper with a discussion on promising directions for future research in this area that will lead to the improvement of IPV.
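
The three analyses named in the abstract can be illustrated with a small brute-force sketch. This is not IPV's implementation, which the abstract describes only as parallel and multi-threaded; it is a single-threaded illustration under stated assumptions. The function names (`group_sizes`, `vulnerability_index`, `rare_records`, `minimal_identifiers`), the threshold `k`, and the toy `records` are all hypothetical. In particular, reading "lead to few records" as "some group of records sharing the same values has at most `k` members" is an assumption made here, not a definition taken from the paper.

```python
from collections import Counter
from itertools import combinations


def group_sizes(records, attrs):
    """Count how many records share each combination of values over attrs."""
    return Counter(tuple(r[a] for a in attrs) for r in records)


def vulnerability_index(records, attrs):
    """Cardinality of the smallest group of records sharing values on attrs."""
    return min(group_sizes(records, attrs).values())


def rare_records(records, attrs, k=1):
    """Indices of records whose value combination on attrs occurs at most k times."""
    sizes = group_sizes(records, attrs)
    return [i for i, r in enumerate(records)
            if sizes[tuple(r[a] for a in attrs)] <= k]


def minimal_identifiers(records, k=1):
    """Minimal attribute combinations whose smallest group has at most k records.

    Assumed reading: size-1 results play the role of direct identifiers,
    larger ones the role of quasi-identifiers. Exhaustive search, so it is
    exponential in the number of attributes and only suitable for toy data.
    """
    names = sorted(records[0])
    found = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # Skip supersets of combinations already found: they are not minimal.
            if any(set(f) <= set(combo) for f in found):
                continue
            if vulnerability_index(records, combo) <= k:
                found.append(combo)
    return found


# Hypothetical toy dataset: no single attribute isolates a record,
# but every pair of attributes does.
records = [
    {"zip": "10001", "age": 34, "sex": "F"},
    {"zip": "10001", "age": 34, "sex": "M"},
    {"zip": "10002", "age": 34, "sex": "F"},
    {"zip": "10002", "age": 51, "sex": "F"},
    {"zip": "10001", "age": 51, "sex": "M"},
    {"zip": "10002", "age": 51, "sex": "M"},
]
```

On this toy data, `vulnerability_index(records, ("zip",))` evaluates to 3, while `vulnerability_index(records, ("age", "sex"))` evaluates to 1 and `rare_records(records, ("age", "sex"))` returns the two records that are unique on that pair. The exhaustive search is chosen for readability only; a system aimed at large datasets would need pruning and parallelism of the kind the abstract alludes to.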
