Journal: International Journal of Innovative Computing Information and Control

PUBLISHING SENSITIVE TIME-SERIES DATA UNDER PRESERVATION OF PRIVACY AND DISTANCE ORDERS

Abstract

In this paper, we address the problem of preserving mining accuracy as well as privacy when publishing sensitive time-series data. For example, people with heart disease do not want to disclose their ECG time-series, but they may still allow accurate patterns to be mined from those time-series. Our privacy model assumes that (1) data sources publish their time-series independently, and (2) all information used in publishing time-series can be publicly revealed. Based on this model, we introduce three assumptions: full disclosure, equi-uncertainty, and independency. We also derive two requirements: uncertainty preservation and distance order preservation. We show that only randomization methods satisfy all three assumptions, but even those methods do not satisfy both requirements. Thus, we discuss randomization-based solutions that satisfy all assumptions and requirements. For this purpose, we present a novel notion, the noise averaging effect of piecewise aggregate approximation (PAA), which is derived from the simple intuition that the summation of random noise converges to 0. This noise averaging effect can alleviate the problem of destroyed distance orders in randomly perturbed time-series. Based on the noise averaging effect, we first propose two naive solutions that use random data perturbation in publishing time-series while exploiting the PAA distance in computing distances. There is, however, a tradeoff between these two solutions with respect to uncertainty and distance orders. We thus propose three more advanced solutions that take advantage of both naive solutions. Experimental results show that our advanced solutions are superior to the naive solutions in the preservation of uncertainty, distance orders, and clustering accuracy.
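
The noise averaging effect mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' naive or advanced solutions, which the abstract does not specify. It only assumes zero-mean Gaussian noise and the commonly used PAA definition (segment means, with the distance scaled by the square root of the segment width). All function names, the noise level `sigma`, and the segment count are hypothetical choices made for illustration.

```python
import math
import random


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    if len(series) % segments != 0:
        raise ValueError("series length must be a multiple of segments")
    width = len(series) // segments
    return [sum(series[i:i + width]) / width
            for i in range(0, len(series), width)]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def paa_distance(a, b, segments):
    """PAA distance: Euclidean distance of segment means, scaled by sqrt(width)."""
    scale = math.sqrt(len(a) / segments)
    return scale * euclidean(paa(a, segments), paa(b, segments))


def perturb(series, sigma, rng):
    """Publish a perturbed copy by adding zero-mean Gaussian noise (an assumption)."""
    return [x + rng.gauss(0.0, sigma) for x in series]


def rms(values):
    """Root mean square, used here as a simple measure of noise magnitude."""
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    rng = random.Random(42)           # fixed seed so runs are repeatable
    n, segments, sigma = 1024, 16, 1.0

    # Two synthetic series standing in for sensitive time-series.
    a = [math.sin(i / 20.0) for i in range(n)]
    b = [math.sin(i / 20.0 + 0.5) for i in range(n)]

    a_pub = perturb(a, sigma, rng)
    b_pub = perturb(b, sigma, rng)

    # Noise left in the published series, before and after PAA.
    noise = [p - x for p, x in zip(a_pub, a)]
    print("RMS noise, raw points:   ", rms(noise))
    print("RMS noise, segment means:", rms(paa(noise, segments)))

    # Distances on original vs. perturbed data.
    print("Euclidean, original: ", euclidean(a, b))
    print("Euclidean, perturbed:", euclidean(a_pub, b_pub))
    print("PAA dist, original:  ", paa_distance(a, b, segments))
    print("PAA dist, perturbed: ", paa_distance(a_pub, b_pub, segments))
```

Because each segment mean averages many independent noise samples, the noise remaining in the PAA representation is typically much smaller than in the raw points. That is the intuition behind computing distances on PAA values while still publishing point-wise perturbed data.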
