Applying Deep Learning to Preserve Data Confidentiality Keynote Address


Abstract

Preserving data confidentiality is a crucial problem when releasing microdata for public use. Many approaches have been proposed so far, and many of them rely on traditional probability and statistics to mask the original data. However, their performance needs to improve significantly in practice. In this paper, we approach the problem with a deep learning-based generative model, which can generate simulated data that are closely related to the raw data but differ from them in every item. Because a generative model works by transforming samples drawn from a noise distribution (such as a uniform distribution) into samples that follow the distribution of a real dataset (such as a Gaussian), statistical discrepancies between the two make it hard to guarantee that the generated data represent the raw data in practice. Despite the strong generative ability of deep learning, this issue remains. In this study, we explore the statistical similarity between the two datasets via a deep learning-based generative model, and we introduce two statistical evaluation metrics to assess that similarity. We conducted extensive experiments to validate our idea on two real-world datasets, a census dataset and an environmental dataset.
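The mechanism described in the abstract, mapping uniform noise to a data-like distribution and then checking how statistically close the result is to the raw data, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the "generator" is a fixed inverse-CDF transform rather than a trained deep model, the raw column is simulated, and the two metrics (a two-sample Kolmogorov-Smirnov statistic and a gap in mean and standard deviation) are stand-ins, since the abstract does not name the metrics the author used.

```python
import random
from statistics import NormalDist, mean, stdev


def fit_generator(raw):
    """Return a function mapping uniform noise in (0, 1) to Gaussian values.

    Assumption: a hand-fitted inverse CDF stands in for a learned generator.
    """
    return NormalDist.from_samples(raw).inv_cdf


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the
    empirical CDFs of a and b (0 = identical, 1 = fully separated)."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x, then compare CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def moment_gap(a, b):
    """Absolute differences in mean and in standard deviation."""
    return abs(mean(a) - mean(b)), abs(stdev(a) - stdev(b))


rng = random.Random(0)
# Hypothetical raw column (simulated here; not one of the paper's datasets).
raw = [rng.gauss(50.0, 10.0) for _ in range(2000)]

generate = fit_generator(raw)
# Keep noise strictly inside (0, 1), where the inverse CDF is defined.
noise = [rng.uniform(1e-9, 1 - 1e-9) for _ in range(2000)]
synthetic = [generate(u) for u in noise]

print("KS statistic:", ks_statistic(raw, synthetic))
print("mean gap, std gap:", moment_gap(raw, synthetic))
```

A small KS statistic and small moment gaps suggest the synthetic column follows the raw distribution, while the individual synthetic values are generated from fresh noise rather than copied from the raw records.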


Bibliographic record

Source
IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, 2018, pp. 3-3 (1 page)
Author
Xiaohi Cui
Original format
PDF
Keywords
Machine learning; Software engineering; Data models; Probability; Transforms; Measurement; Bibliographies


