Conference: Privacy in Statistical Databases

Hybrid Microdata via Model-Based Clustering

Abstract

In this paper we propose a new scheme for statistical disclosure limitation that can be classified as a hybrid method of protection, that is, a method that combines properties of perturbative and synthetic methods. The approach applies model-based clustering and then synthesizes the records within each cluster. The novelty is that the clustering and synthesis methods have been chosen to fit each other, with a view to reducing information loss. The model-based clustering seeks clusters whose within-cluster data distribution is approximately normal, so that a multivariate normal synthesizer can be used for the local synthesis of data. In this way, some of the non-normal characteristics of the data are captured by the clustering, and a simple synthesizer for normal data suffices within each cluster. Our method is shown to be effective when compared to other disclosure limitation strategies.
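
The two-stage idea in the abstract (cluster so that each cluster looks roughly normal, then synthesize within each cluster from a fitted multivariate normal) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which clustering algorithm or synthesizer they use, so the plain EM fit of a Gaussian mixture, the function names `fit_gmm` and `synthesize`, the regularization constant, the small-cluster fallback, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def log_gauss(X, mean, cov):
    """Log-density of each row of X under N(mean, cov)."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)           # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2 * np.pi) + log_det + np.sum(z ** 2, axis=0))

def fit_gmm(X, k, n_iter=100, reg=1e-6, seed=0):
    """Plain EM for a k-component Gaussian mixture (assumed stand-in for
    'model-based clustering'); returns a hard cluster label per record."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, size=k, replace=False)]
    covs = np.array([np.cov(X, rowvar=False) + reg * np.eye(d)] * k)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each record
        log_r = np.stack(
            [np.log(weights[j]) + log_gauss(X, means[j], covs[j]) for j in range(k)],
            axis=1,
        )
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and covariances
        nk = r.sum(axis=0) + 1e-12
        weights = nk / n
        means = (r.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (r[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
    return r.argmax(axis=1)

def synthesize(X, labels, seed=0):
    """Replace the records of each cluster with draws from a multivariate
    normal fitted to that cluster's sample mean and covariance."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    X_syn = np.empty_like(X, dtype=float)
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        mean = X[idx].mean(axis=0)
        if idx.size > d:
            cov = np.cov(X[idx], rowvar=False)
        else:
            # Assumed fallback: too few records to estimate a local covariance
            cov = np.cov(X, rowvar=False)
        X_syn[idx] = rng.multivariate_normal(mean, cov, size=idx.size)
    return X_syn

# Toy data (hypothetical): two well-separated groups, so the pooled data are
# clearly non-normal while each group is approximately normal.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal([0.0, 0.0], 1.0, size=(200, 2)),
    rng.normal([8.0, 8.0], 1.0, size=(200, 2)),
])
labels = fit_gmm(X, k=2)
X_syn = synthesize(X, labels)
```

In this sketch the bimodal shape of the pooled data is carried by the cluster labels, while each cluster only needs a mean vector and a covariance matrix. A real application would also have to choose the number of clusters and assess disclosure risk, neither of which is covered here.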