Engineering Applications of Artificial Intelligence > Toward sensitive document release with privacy guarantees

Toward sensitive document release with privacy guarantees



Abstract

Privacy has become a serious concern for modern information societies. The sensitive nature of many of the data that are daily exchanged or released to untrusted parties requires that responsible organizations undertake appropriate privacy protection measures. Nowadays, many of these data are texts (e.g., emails, messages posted in social media, healthcare outcomes, etc.) that, because of their unstructured and semantic nature, constitute a challenge for automatic data protection methods. In fact, textual documents are usually protected manually, in a process known as document redaction or sanitization. To do so, human experts identify sensitive terms (i.e., terms that may reveal identities and/or confidential information) and protect them accordingly (e.g., via removal or, preferably, generalization). To relieve experts of this burdensome task, in a previous work we introduced the theoretical basis of C-sanitization, an inherently semantic privacy model that provides the foundation for developing automatic document redaction/sanitization algorithms and offers clear a priori privacy guarantees on data protection. Despite its potential benefits, C-sanitization still presents some limitations in practice (mainly regarding flexibility, efficiency and accuracy). In this paper, we propose a new, more flexible model, named (C, g(C))-sanitization, which enables an intuitive configuration of the trade-off between the desired level of protection (i.e., controlled information disclosure) and the preservation of the utility of the protected data (i.e., the amount of semantics to be preserved). Moreover, we also present a set of technical solutions and algorithms that provide an efficient and scalable implementation of the model and improve its practical accuracy, as we illustrate through empirical experiments.
