Proximity patterns are derived from frequent patterns and combine characteristics of frequent itemsets and frequent subgraphs. Research on proximity patterns has concentrated mainly on unlabeled graphs, with applications chiefly in social networks, semantic networks, and smart grids. Because proximity pattern mining involves both frequent itemset mining and frequent subgraph mining, existing frequent pattern mining algorithms do not handle it well. Building on the proximity pattern structure, this paper extends it to labeled graphs, introduces a label-set constraint, and designs LCPP (Label-Constraint Proximity Pattern), an algorithm for mining label-set-constrained proximity patterns. The algorithm is deployed in parallel on the MapReduce computing model, compensating for the low efficiency of the open-source pFP algorithm on large-scale data. Experimental results confirm the effectiveness and scalability of the algorithm and indicate that LCPP is an excellent complement to pFP.