The Use of Concept Hierarchies in Privacy Preserving Data Acquisition for Data Mining.

Abstract

This thesis presents a concept hierarchy-based approach to privacy preserving data collection for data mining, called the p-level model. The p-level model allows data providers to divulge information at any chosen privacy level (p-level) on any attribute. Data collected at a higher p-level is divulged at a higher conceptual level and therefore offers more privacy. Data providers gain greater control over their privacy preferences, and they provided significantly more personal data values (25-75% more), at various p-levels, than when providing the same information through the regular fixed-level (f-level) method of data collection.

However, the data mining process, which involves integrating various data values, can constitute a privacy breach if combinations of attributes at the various p-levels allow the inference of knowledge that exists at lower p-levels. Providing anonymity guarantees prior to release can further protect the collected data set from privacy breaches caused by linking the released data set with external data sets. This thesis describes the p-level reduction phenomenon and proposes methods to identify and control the occurrence of this privacy breach.

One objective of this thesis is to explore the feasibility of applying data collected with the p-level approach to data mining problems. We apply data collected using the p-level approach to a data classification problem and find that the mining accuracy of the p-level classifier is comparable to that of the f-level (no privacy) approach. We therefore conclude that the p-level approach is beneficial for privacy preserving data collection.
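As a rough illustration of the idea in the abstract, the sketch below shows how a data provider's value might be divulged at a chosen p-level by walking up a concept hierarchy. This is a minimal sketch, not the thesis's implementation: the hierarchy contents, the names `CITY_HIERARCHY`, `generalize`, and `divulge`, and the convention that p-level 0 is the most specific level are all assumptions made for illustration.

```python
# Hypothetical concept hierarchy for a "city" attribute. Each leaf value maps
# to its chain of increasingly general concepts; the list index is the p-level.
# Assumption (not from the thesis): p-level 0 is the most specific level.
CITY_HIERARCHY = {
    "Calgary": ["Calgary", "Alberta", "Canada", "Any"],
    "Edmonton": ["Edmonton", "Alberta", "Canada", "Any"],
    "Toronto": ["Toronto", "Ontario", "Canada", "Any"],
}


def generalize(value, p_level, hierarchy):
    """Return the concept that stands in for `value` at the given p-level."""
    chain = hierarchy[value]
    if not 0 <= p_level < len(chain):
        raise ValueError(f"p-level {p_level} is outside 0..{len(chain) - 1}")
    return chain[p_level]


def divulge(record, preferences, hierarchies):
    """Generalize each attribute of `record` to the provider's chosen p-level.

    Attributes without a stated preference default to p-level 0.
    """
    return {
        attr: generalize(value, preferences.get(attr, 0), hierarchies[attr])
        for attr, value in record.items()
    }


record = {"city": "Calgary"}
hierarchies = {"city": CITY_HIERARCHY}
print(divulge(record, {"city": 2}, hierarchies))
```

A higher p-level here returns a more general concept, which mirrors the abstract's statement that a higher p-level means divulgence at a higher conceptual level.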

Bibliographic details

  • Author affiliation: University of Calgary (Canada)
  • Degree-granting institution: University of Calgary (Canada)
  • Subject: Computer science; Information technology
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 183 p.
  • Total pages: 183
  • Original format: PDF
  • Language of text: English