
Privacy Parameter Variation Using RAPPOR on a Malware Dataset




Stricter data protection regulations and the poor application of privacy protection techniques have created a need for data-driven companies to adopt new methods of analysing sensitive user data. The RAPPOR (Randomized Aggregatable Privacy-Preserving Ordinal Response) method adds parameterised noise, and the parameters must be selected carefully to maintain adequate privacy without losing analytical value. This paper applies variations of the RAPPOR privacy parameter to a public dataset containing lists of running Android applications. The dataset is filtered and sampled into small (10,000), medium (100,000), and large (1,200,000) sample sizes, and RAPPOR is applied with ε = 10, 1.0, and 0.1 (low, medium, and high privacy guarantees, respectively). In addition, to observe detailed variations between high and medium privacy guarantees (ε = 0.5 to 1.0), a second experiment progressively adjusts the value of ε over the same populations. The first experiment confirms the original RAPPOR studies: with ε = 1, no signal is recoverable in the small sample, while a detectable signal appears in the medium and large samples, as also demonstrated in the original RAPPOR paper. Further results under high privacy guarantees show that the large sample, in contrast to the medium one, suffers 2.75 times more in terms of recoverability when the privacy guarantee is tightened from ε = 1.0 to 0.8. Overall, the paper demonstrates that high privacy guarantees restrict the analysis to only the most dominant strings.
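To make the trade-off between the privacy parameter and sample size concrete, the following is a minimal sketch of the permanent randomized response step only, not the authors' experimental setup. It assumes the relation ε = 2h·ln((1 − f/2)/(f/2)) from the original RAPPOR paper and an illustrative h = 2 hash functions; the function names and the single-bit simulation are hypothetical and chosen for illustration.

```python
import math
import random


def f_from_epsilon(epsilon, num_hashes=2):
    """Noise probability f for a given epsilon.

    Assumes epsilon = 2h * ln((1 - f/2) / (f/2)); num_hashes (h) = 2 is an
    illustrative assumption, not a value taken from the abstract.
    """
    return 2.0 / (1.0 + math.exp(epsilon / (2.0 * num_hashes)))


def randomize_bit(bit, f, rng):
    """Report 1 with probability f/2, 0 with probability f/2, else the true bit."""
    u = rng.random()
    if u < f / 2.0:
        return 1
    if u < f:
        return 0
    return bit


def estimate_true_count(observed_ones, n, f):
    """Invert E[observed] = t*(1 - f/2) + (n - t)*(f/2) to recover t."""
    return (observed_ones - n * f / 2.0) / (1.0 - f)


def simulate(n, true_fraction, epsilon, seed=0):
    """Estimate the fraction of users holding one bit after randomization."""
    rng = random.Random(seed)
    f = f_from_epsilon(epsilon)
    true_ones = int(n * true_fraction)
    observed = sum(
        randomize_bit(1 if i < true_ones else 0, f, rng) for i in range(n)
    )
    return estimate_true_count(observed, n, f) / n


if __name__ == "__main__":
    for eps in (10.0, 1.0, 0.1):
        for n in (10_000, 100_000, 1_200_000):
            print(eps, n, round(simulate(n, 0.3, eps), 3))
```

Under these assumptions, a smaller ε pushes f toward 1, so the estimator divides by a very small 1 − f and amplifies sampling noise. That is consistent with the abstract's observation that small samples lose recoverability first and that only dominant strings survive high privacy guarantees.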




