Journal of the Royal Statistical Society

Bayesian non-parametric generation of fully synthetic multivariate categorical data in the presence of structural zeros

Abstract

Statistical agencies are increasingly adopting synthetic data methods for disseminating microdata without compromising the privacy of respondents. Crucial to the implementation of these approaches are flexible models, able to capture the nuances of the multivariate structure in the original data. In the case of multivariate categorical data, preserving this multivariate structure also often involves satisfying constraints in the form of combinations of responses that cannot logically be present in any data set (like married toddlers or pregnant men), also known as structural zeros. Ignoring structural zeros can result in both logically inconsistent synthetic data and biased estimates. Here we propose the use of a Bayesian non-parametric method for generating discrete multivariate synthetic data subject to structural zeros. This method can preserve complex multivariate relationships between variables, can be applied to high-dimensional data sets with massive collections of structural zeros, requires minimal tuning from the user and is computationally efficient. We demonstrate our approach by synthesizing an extract of 17 variables from the 2000 US census. Our method produces synthetic samples with high analytic utility and low disclosure risk.
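To make the notion of structural zeros concrete, the sketch below draws synthetic categorical records from a small mixture of independent categorical distributions and discards any record that falls in an impossible cell. It is an illustration only, not the method proposed in the paper: the variables, levels, class weights, and probabilities are all hypothetical, the mixture is fixed rather than fitted to data, and simple rejection stands in for whatever sampling scheme the authors actually use.

```python
import random

# Hypothetical toy variables and levels (not the census extract from the paper).
LEVELS = {
    "age_group": ["toddler", "adult"],
    "marital": ["single", "married"],
    "sex": ["female", "male"],
    "pregnant": ["no", "yes"],
}

def is_structural_zero(rec):
    """Return True if the record is a logically impossible combination."""
    if rec["age_group"] == "toddler" and rec["marital"] == "married":
        return True
    if rec["sex"] == "male" and rec["pregnant"] == "yes":
        return True
    return False

# Assumed two-class mixture of independent categoricals (fixed, not estimated).
CLASS_WEIGHTS = [0.3, 0.7]
CLASS_PROBS = [
    {"age_group": [0.9, 0.1], "marital": [0.8, 0.2],
     "sex": [0.5, 0.5], "pregnant": [0.95, 0.05]},
    {"age_group": [0.05, 0.95], "marital": [0.4, 0.6],
     "sex": [0.5, 0.5], "pregnant": [0.9, 0.1]},
]

def draw_record(rng):
    """Draw one record from the unconstrained mixture."""
    k = rng.choices(range(len(CLASS_WEIGHTS)), weights=CLASS_WEIGHTS)[0]
    return {
        var: rng.choices(LEVELS[var], weights=CLASS_PROBS[k][var])[0]
        for var in LEVELS
    }

def synthesize(n, seed=0):
    """Keep drawing until n records outside the structural zeros are collected."""
    rng = random.Random(seed)
    records = []
    while len(records) < n:
        rec = draw_record(rng)
        if not is_structural_zero(rec):
            records.append(rec)
    return records

synthetic = synthesize(1000)
```

Rejecting infeasible draws is equivalent to sampling from the mixture truncated to the feasible cells, so no married toddlers or pregnant men can appear in `synthetic`. Rejection becomes wasteful when the impossible cells carry a large share of the unconstrained probability, which is one reason a dedicated method is needed for data sets with massive collections of structural zeros.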