Differentially private maximal frequent sequence mining

Xiang Cheng; Sen Su; Shengzhi Xu; Peng Tang; Zhengyi Li

首页> 外文期刊>Computers & Security >Differentially private maximal frequent sequence mining

【24h】

Differentially private maximal frequent sequence mining

机译：差分私有最大频繁序列挖掘

获取原文

获取原文并翻译 | 示例

开具论文收录证明 >>

页面导航

摘要
著录项
引文网络
相似文献
相关主题

摘要

In this paper, we study the problem of designing a differentially private algorithm for mining maximal frequent sequences, which can not only achieve high data utility and a high degree of privacy, but also provide high time efficiency. To solve this problem, we present a new differentially private algorithm, which is referred to as DP-MFSM. DP-MFSM consists of three phases: pre-processing phase, expected frequent sequence mining (ESM) phase, and candidate extraction and verification (CEV) phase. Specifically, in the pre-processing phase, we first extract some statistical information from the input database, and use the extracted information to determine the values of some variables which will be used in the ESM phase. Then, in the ESM phase, we randomly partition the input database into several sub-databases, and use a partition-based ESM technique to find expected frequent sequences, which are a subset of candidate frequent sequences and more likely to be frequent. At last, in the CEV phase, we extract candidate maximal frequent sequences from the discovered expected frequent sequences, and use a splitting-based technique to verify which candidates are actually frequent in the input database. Through privacy analysis, we show that our DP-MFSM algorithm is ε-differentially private. Extensive experiments on real-world datasets illustrate that our DP-MFSM algorithm can substantially outperform alternative approaches.

机译：本文研究了一种设计最大挖掘频繁序列的差分私有算法的问题，该算法不仅可以实现较高的数据利用率和高度的保密性，而且可以提供较高的时间效率。为了解决这个问题，我们提出了一种新的差分私有算法，称为DP-MFSM。 DP-MFSM包含三个阶段：预处理阶段，预期频繁序列挖掘（ESM）阶段和候选者提取和验证（CEV）阶段。具体来说，在预处理阶段，我们首先从输入数据库中提取一些统计信息，然后使用提取的信息来确定将在ESM阶段中使用的某些变量的值。然后，在ESM阶段，我们将输入数据库随机分为几个子数据库，并使用基于分区的ESM技术来查找预期的频繁序列，该序列是候选频繁序列的子集，并且更有可能是频繁的。最后，在CEV阶段，我们从发现的预期频繁序列中提取候选最大频繁序列，并使用基于拆分的技术来验证输入数据库中哪些候选实际上是频繁的。通过隐私分析，我们表明我们的DP-MFSM算法是ε-差分私有的。在真实数据集上进行的大量实验表明，我们的DP-MFSM算法可以大大胜过其他方法。

著录项

来源
《Computers & Security》 |2015年第11期|175-192|共18页
作者
Xiang Cheng; Sen Su; Shengzhi Xu; Peng Tang; Zhengyi Li;
展开▼
作者单位

State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China;

State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China;

State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China;

State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China;

State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China;

展开▼
收录信息美国《科学引文索引》(SCI);美国《工程索引》(EI);
原文格式 PDF
正文语种 eng
中图分类
关键词
Differential privacy; Maximal frequent sequence mining; Frequent sequence mining; Length reduction; Threshold relaxation;

机译：差异性隐私;最大频繁序列挖掘;频繁的序列挖掘;长度减少;阈值松弛;

相似文献

外文文献
中文文献
专利

1. Differentially Private Frequent Sequence Mining [J] . Shengzhi Xu, Xiang Cheng, Sen Su, IEEE Transactions on Knowledge and Data Engineering . 2016,第11期

机译：差分私有频繁序列挖掘
2. FMaxCloHUSM: An efficient algorithm for mining frequent closed and maximal high utility sequences [J] . Tin Truong, Hai Duong, Bac Le, Engineering Applications of Artificial Intelligence . 2019,第OCTa期

机译：FMaxCloHUSM：一种有效的算法，用于挖掘频繁关闭和最大的高效用序列
3. FMaxCloHUSM: An efficient algorithm for mining frequent closed and maximal high utility sequences [J] . Tin Truong, Hai Duong, Bac Le, Engineering Applications of Artificial Intelligence . 2019,第Octa期

机译：FMaxCloHUSM：一种有效的算法，用于挖掘频繁关闭和最大的高效用序列
4. Differentially private frequent sequence mining via sampling-based candidate pruning [C] . Shengzhi Xu, Sen Su, Xiang Cheng, IEEE international conference on data engineering . 2015

机译：通过基于采样的候选修剪进行差分私有频繁序列挖掘
5. Algorithms for Mining Maximal Quasi Frequent Subnetworks [D] . El Radie, Eihab Salah. 2018

机译：用于采矿最大准则频繁子网的算法
6. Differentially Private Frequent Sequence Mining via Sampling-based Candidate Pruning [O] . Shengzhi Xu, Sen Su, Xiang Cheng, -1

机译：通过基于采样的候选修剪进行差分私有频繁序列挖掘
7. Differentially Private Frequent Sequence Mining via Sampling-based Candidate Pruning [O] . Shengzhi Xu, Sen Su, Xiang Cheng, 2016

机译：基于抽样的候选剪枝差分私有频繁序列挖掘

Differentially private maximal frequent sequence mining

摘要

著录项

引文网络

相似文献

相关主题

期刊订阅