Journal: Journal of Statistical Computation and Simulation

DataSifter: statistical obfuscation of electronic health records and other sensitive datasets

Abstract

There are no practical and effective mechanisms for sharing high-dimensional data that include sensitive information, in fields such as health, financial intelligence, or socioeconomics, without either compromising the utility of the data or exposing private personal or secure organizational information. Excessive scrambling or encoding of the information makes it less useful for modelling or analytical processing. Insufficient preprocessing may compromise sensitive information and introduce a substantial risk of re-identification of individuals by various stratification techniques. To address this problem, we developed a novel statistical obfuscation method (DataSifter) for on-the-fly de-identification of structured and unstructured sensitive high-dimensional data, such as clinical data from electronic health records (EHR). DataSifter provides complete administrative control over the balance between the risk of data re-identification and the preservation of the data information. Simulation results suggest that DataSifter can provide privacy protection while maintaining data utility for different types of outcomes of interest. The application of DataSifter to a large autism dataset provides a realistic demonstration of its promise in practical applications.
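
The trade-off the abstract describes, between re-identification risk and preserved information, can be illustrated with a deliberately simple toy. The sketch below is not the DataSifter algorithm, whose details are not given on this page. It is a generic value-swapping scheme, and the names `obfuscate`, `fraction_changed`, and the parameter `level` are hypothetical, introduced only for illustration. It assumes a small table of numeric rows and shows how a single knob can move the output between "unchanged" and "heavily shuffled" while each column keeps the same set of values.

```python
import random


def obfuscate(rows, level, seed=0):
    """Toy obfuscation: swap cell values between rows, column by column.

    Illustrative only; this is NOT the DataSifter method. `level` is a
    hypothetical knob in [0, 1]: 0.0 returns an unchanged copy, and larger
    values make each cell more likely to be swapped with another row's cell.
    """
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the result is reproducible
    out = [list(row) for row in rows]  # copy; the input is never mutated
    n_rows = len(out)
    n_cols = len(out[0]) if out else 0
    for col in range(n_cols):
        for i in range(n_rows):
            if rng.random() < level:
                j = rng.randrange(n_rows)  # partner row (may equal i)
                out[i][col], out[j][col] = out[j][col], out[i][col]
    return out


def fraction_changed(original, obfuscated):
    """Share of cells that differ from the original (a crude information-loss proxy)."""
    cells = [(a, b) for ro, rb in zip(original, obfuscated) for a, b in zip(ro, rb)]
    return sum(a != b for a, b in cells) / len(cells) if cells else 0.0


# Usage: made-up records of (age, lab value); the values are illustrative only.
records = [[34, 120.5], [51, 98.0], [29, 143.2], [62, 110.7]]
light = obfuscate(records, level=0.1, seed=1)
heavy = obfuscate(records, level=0.9, seed=1)
loss_light = fraction_changed(records, light)
loss_heavy = fraction_changed(records, heavy)
```

Because values are only swapped within a column, each column's marginal distribution is preserved exactly, while the links between columns in a given row are increasingly broken as `level` grows. A real method would need a far more careful treatment of both utility and disclosure risk than this toy provides.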

Bibliographic details

  • Author affiliations

    University of Michigan, Statistics Online Computational Resource, Ann Arbor, MI 48109, USA;

    University of Michigan, Statistics Online Computational Resource, Ann Arbor, MI 48109, USA; University of Michigan, Department of Biostatistics, Ann Arbor, MI 48109, USA;

    University of Michigan, Statistics Online Computational Resource, Ann Arbor, MI 48109, USA;

    University of Michigan, Department of Biostatistics, Ann Arbor, MI 48109, USA;

    University of Michigan, Statistics Online Computational Resource, Ann Arbor, MI 48109, USA;

    University of Michigan, Statistics Online Computational Resource, Ann Arbor, MI 48109, USA; University of Michigan, Department of Health Behavior and Biological Sciences, Ann Arbor, MI 48109, USA; University of Michigan, Department of Computational Medicine and Bioinformatics, Ann Arbor, MI 48109, USA; University of Michigan, Michigan Institute for Data Science, Ann Arbor, MI 48109, USA;

  • Indexed in: Science Citation Index (SCI), USA;
  • Original format: PDF
  • Language: English
  • Keywords

    Data sharing; personal privacy; information protection; Big Data; statistical method;
