Existing data perturbation methods have difficulty preserving the clustering utility of the original data. To address this, a privacy-preserving data perturbation algorithm, DPTPE, is proposed. It partitions the nodes into two types based on neighborhood topological potential entropy. For a neighborhood-dispersed node, the original coordinates are replaced with the mean of the coordinates of the nodes in its k-neighborhood; for a neighborhood-concentrated node, the node is replaced with a node chosen at random from its safety neighborhood. Experiments show that DPTPE protects the privacy of the data and also maintains the clustering utility of the data set comparatively well.
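The two replacement rules can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the topological potential entropy formula nor the definition of the safety neighborhood, so this sketch makes two labeled assumptions. It classifies a node as dispersed when the mean distance to its k nearest neighbors exceeds a hypothetical `spread_threshold`, and it treats the k nearest neighbors as the safety neighborhood. The names `k_nearest`, `perturb`, and `spread_threshold` are illustrative only.

```python
import math
import random


def k_nearest(points, i, k):
    """Return the indices of the k points closest to points[i], excluding i."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def perturb(points, k, spread_threshold, rng=None):
    """Perturb each point using one of the two rules described in the abstract.

    ASSUMPTION: mean k-neighbor distance > spread_threshold stands in for the
    paper's entropy-based test for a "neighborhood-dispersed" node.
    ASSUMPTION: the k nearest neighbors stand in for the "safety neighborhood".
    """
    if k < 1 or len(points) <= k:
        raise ValueError("need 1 <= k < number of points")
    rng = rng or random.Random()
    result = []
    for i, p in enumerate(points):
        nbrs = k_nearest(points, i, k)
        mean_dist = sum(math.dist(p, points[j]) for j in nbrs) / k
        if mean_dist > spread_threshold:
            # Dispersed node: replace coordinates with the neighbors' mean.
            new = tuple(
                sum(points[j][d] for j in nbrs) / k for d in range(len(p))
            )
        else:
            # Concentrated node: replace with a randomly chosen neighbor.
            new = tuple(points[rng.choice(nbrs)])
        result.append(new)
    return result


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
    print(perturb(data, k=3, spread_threshold=2.0, rng=random.Random(0)))
```

In this toy data set the four points near the origin are treated as concentrated and each is swapped for one of its neighbors, while the isolated point at (10, 10) is treated as dispersed and moved to the mean of its three nearest neighbors. Both rules keep a perturbed point close to its local neighborhood, which is the intuition behind preserving cluster structure.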