Journal of the Royal Statistical Society

General and specific utility measures for synthetic data


Abstract

Data holders can produce synthetic versions of data sets when concerns about potential disclosure restrict the availability of the original records. The paper is concerned with methods to judge whether such synthetic data have a distribution that is comparable with that of the original data: what we term general utility. We consider how general utility compares with specific utility: the similarity of results of analyses from the synthetic data and the original data. We adapt a previous general measure of data utility, the propensity score mean-squared error (pMSE), to the specific case of synthetic data and derive its distribution for the case when the correct synthesis model is used to create the synthetic data. Our asymptotic results are confirmed by a simulation study. We also consider two specific utility measures, confidence interval overlap and standardized difference in summary statistics, which we compare with the general utility results. We present two contrasting examples of data syntheses: one illustrating synthetic data that are evaluated as useful by both general and specific measures, and a second for which neither is the case. For the second case we show how the general utility measures can identify the deficiencies of the synthetic data and suggest how this can inform possible improvements to the synthesis method.
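The abstract names the pMSE and confidence interval overlap without stating their formulas. As a rough illustration only, the sketch below assumes the commonly used definitions: the pMSE as the mean squared deviation of estimated propensity scores from the share of synthetic records in the stacked data, and the overlap as the average fraction of each interval covered by the intersection of the two. The function names and arguments are hypothetical, and the propensity scores are assumed to come from a classifier fitted elsewhere to distinguish synthetic records from original ones.

```python
def pmse(propensity_scores, synthetic_share):
    """Assumed definition: mean of (p_i - c)^2 over the stacked data.

    propensity_scores: estimated probability that each record is synthetic
    synthetic_share:   c, the proportion of synthetic records in the stack
    """
    n = len(propensity_scores)
    return sum((p - synthetic_share) ** 2 for p in propensity_scores) / n


def ci_overlap(orig_ci, synth_ci):
    """Assumed definition: average share of each interval that is common to both.

    Each argument is a (lower, upper) pair for the same estimate.
    The result is 1 for identical intervals and negative for disjoint ones.
    """
    lo_o, hi_o = orig_ci
    lo_s, hi_s = synth_ci
    shared = min(hi_o, hi_s) - max(lo_o, lo_s)
    return 0.5 * (shared / (hi_o - lo_o) + shared / (hi_s - lo_s))


# Scores all equal to c mean the classifier cannot tell the data sets apart.
indistinguishable = pmse([0.5, 0.5, 0.5, 0.5], synthetic_share=0.5)

# Two intervals of equal width that share half their length.
half_overlap = ci_overlap((0.0, 2.0), (1.0, 3.0))
```

Under these assumed definitions, a pMSE close to zero points to high general utility, and an overlap close to one points to high specific utility for the estimate in question.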
