首页> 外文期刊>International Journal of Computers & Applications >Subspace-based aggregation for enhancing utility, information measures, and cluster identification in privacy preserved data mining on high-dimensional continuous data
【24h】

Subspace-based aggregation for enhancing utility, information measures, and cluster identification in privacy preserved data mining on high-dimensional continuous data

机译:Subspace-based aggregation for enhancing utility, information measures, and cluster identification in privacy preserved data mining on high-dimensional continuous data

获取原文
获取原文并翻译 | 示例
       

摘要

Clustering is a data mining technique that has been effectively used in the last few decades for knowledge extraction. Privacy is a major problem while releasing data for clustering and therefore privacy-preserving data mining (PPDM) algorithms have been developed. Aggregation is a popular PPDM technique that has been used. However, in the last few years, certain applications require that data mining be performed on high-dimensional data. The present privacy preservation techniques perform aggregation in a univariate manner along each dimension. This affects the utility measures, information measures, and especially retention of original clusters. This paper proposes a new technique called as subspace-based aggregation (SBA). SBA categorizes the dimensions into dense and non-dense subspaces based on the density of points. Aggregation is performed separately for dense and non-dense subspaces. This approach helps to maximize utility measures, information measures, and retention of clusters. SBA is run on high-dimensional continuous datasets from UCI Machine Learning repository. SBA is compared with related work methods such as SINGLE, SIMPLE, MDAV, and PPPCA. SBA provides an improvement of 66 in utility, 400 in cluster identification, 5 in co-variance, and standard deviation.

著录项

获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号