Anonymization and Analysis of Horizontally and Vertically Divided User Profile Databases with Multiple Sensitive Attributes


Abstract

Preventing the identification of individuals is important when data analysts must guarantee the safety of the data they work with. One approach to this problem is to alter or delete part of the data values. In this approach, the attributes of each record are divided into three groups: identifiers (ID), quasi-identifiers (QID), and sensitive attributes (SA). An ID is data that identifies an individual directly, such as a name. QIDs are attributes that can identify an individual when combined, such as age and birthplace. An SA is sensitive information that should not be exposed once a record is linked to an individual. Building on these concepts, safety metrics for data, such as l-diversity, have been proposed. l-diversity assumes that no one knows the SA values, and the data are processed so that attackers cannot identify individuals. However, there are scenarios in which existing methods cannot protect the data against an invasion of privacy. When multiple organizations carry out an analysis jointly, they integrate their data to make the analysis more effective. Although this can yield useful results, the integrated data may include information that attackers can use to identify people. Specifically, if the attacker is one of the organizations providing data, it can use the SA values in its own data as QID values. This violates the assumption of l-diversity, so the existing safety metric no longer protects the data. In this paper, we propose a new anonymization method that conceals organizations' important data by inserting dummy values, thereby enabling analysts to use the data safely. We also provide a calculation method that reduces the influence of the noise introduced by dummy insertion. We confirm the effectiveness of these methods by measuring accuracy in data analysis experiments.
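The insider scenario described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method: it only checks the simplest variant of the metric (distinct l-diversity, i.e. the number of distinct SA values per QID group) on a small invented table. The column names, the values, and the helper `min_distinct_sa` are all hypothetical, and the dummy-insertion method itself is not reproduced here.

```python
from collections import defaultdict


def min_distinct_sa(records, qid_cols, sa_col):
    """Return the smallest number of distinct SA values found in any QID group.

    A table satisfies distinct l-diversity when this value is at least l.
    """
    groups = defaultdict(set)
    for record in records:
        key = tuple(record[col] for col in qid_cols)
        groups[key].add(record[sa_col])
    return min(len(values) for values in groups.values())


# Hypothetical integrated table (invented data): organization A contributed
# the "disease" column and organization B contributed the "income" column.
records = [
    {"age": "30-39", "birthplace": "Tokyo", "disease": "flu", "income": "low"},
    {"age": "30-39", "birthplace": "Tokyo", "disease": "asthma", "income": "high"},
    {"age": "30-39", "birthplace": "Tokyo", "disease": "cancer", "income": "mid"},
    {"age": "40-49", "birthplace": "Osaka", "disease": "flu", "income": "low"},
    {"age": "40-49", "birthplace": "Osaka", "disease": "asthma", "income": "mid"},
    {"age": "40-49", "birthplace": "Osaka", "disease": "cancer", "income": "high"},
]

# An outside attacker knows only the ordinary QIDs (age, birthplace).
outsider_l = min_distinct_sa(records, ["age", "birthplace"], "income")

# Organization A already knows each person's disease, so for this attacker
# "disease" behaves like an additional QID.
insider_l = min_distinct_sa(records, ["age", "birthplace", "disease"], "income")

print(outsider_l, insider_l)
```

In this toy table, every (age, birthplace) group contains three distinct income values, but once the insider adds its own disease column as a QID, each group shrinks to a single record and the income value is fully exposed.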


Bibliographic details

Source
International Conference on Service Operations and Logistics, and Informatics | 2018 | pp. 262-267 | 6 pages
Authors
Yuki Ina; Yuichi Sei; Yasuyuki Tahara; Akihiko Ohsuga
Original format
PDF
Keywords
Databases; Organizations; Diseases; Data analysis; Safety; Data privacy; Lung

