Conference: International Conference on Neural Information Processing

Privacy-Preserving K-Means Clustering Upon Negative Databases

Abstract

Data mining has become very popular with the arrival of the big data era, but it also raises privacy issues. A negative database (NDB) is a new type of data representation that stores the negative image of the data and can protect privacy while supporting basic data mining operations such as classification and clustering. However, the existing clustering algorithm upon NDBs is based on Hamming distance. When a dataset has many categories for each attribute, the encoded data become very long, resulting in low computational efficiency. In this paper, we propose a privacy-preserving k-means clustering algorithm based on Euclidean distance upon NDBs. The main step of the k-means algorithm is to calculate the distance between each record and the cluster centers. To prevent privacy disclosure in this step, we transform each record in the database into an NDB and propose a method to estimate the Euclidean distance between a binary string and an NDB. Our work opens up new ideas for data mining upon negative databases.
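
The abstract identifies the distance computation between each record and the cluster centers as the step where privacy can leak. The following is a minimal sketch of ordinary k-means on plaintext records, not the paper's algorithm: it only illustrates where that distance call sits. The names `euclidean`, `kmeans`, and the pluggable `distance` parameter are assumptions made for illustration; the paper's NDB-based distance estimate is not described in the abstract and is not reproduced here.

```python
import math
import random


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(records, k, distance=euclidean, iterations=20, seed=0):
    """Plain Lloyd-style k-means on plaintext records (illustrative only).

    `distance` is a hypothetical hook: it marks the record-to-center
    comparison that the abstract says must be protected.
    """
    rng = random.Random(seed)
    # Initialize centers from k distinct records.
    centers = [list(c) for c in rng.sample(records, k)]
    labels = [0] * len(records)
    for _ in range(iterations):
        # Assignment step: each record goes to its nearest center.
        labels = [
            min(range(k), key=lambda j: distance(r, centers[j]))
            for r in records
        ]
        # Update step: each center becomes the mean of its assigned records.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, labels = kmeans(data, k=2)
    print(centers, labels)
```

In this sketch the update step also reads raw records. How the paper handles that step on NDBs is not stated in the abstract, so it is left as plain averaging here.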