A MapReduce-Based Parallel Frequent Pattern Growth Algorithm for Spatiotemporal Association Analysis of Mobile Trajectory Big Data

Xia Dawen; Lu Xiaonan; Li Huaqing; Wang Wendong; Li Yantao; Zhang Zili


Abstract

Frequent pattern mining is an effective approach for spatiotemporal association analysis of mobile trajectory big data in data-driven intelligent transportation systems. Existing parallel algorithms have been successfully applied to frequent pattern mining of large-scale trajectory data, but two major challenges remain: how to overcome the inherent weakness of Hadoop in handling taxi trajectory big data that consists of massive numbers of small files, and how to discover implicit spatiotemporal frequent patterns with MapReduce. To address these challenges, this paper presents a MapReduce-based Parallel Frequent Pattern growth (MR-PFP) algorithm that analyzes the spatiotemporal characteristics of taxi operations from large-scale taxi trajectories, using strategies for processing massive small files on a Hadoop platform. More specifically, we first implement three methods, namely Hadoop Archives (HAR), CombineFileInputFormat (CFIF), and Sequence Files (SF), to overcome the existing defects of Hadoop, and then propose two strategies based on their performance evaluations. Next, we incorporate SF into the Frequent Pattern growth (FP-growth) algorithm and implement the optimized FP-growth algorithm on a MapReduce framework. Finally, we use MR-PFP to analyze the characteristics of taxi operations in both the spatial and temporal dimensions in parallel. The results demonstrate that MR-PFP is superior to the existing Parallel FP-growth (PFP) algorithm in efficiency and scalability.
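The idea of packing many small files into key-value records and then counting item supports with a map step and a reduce step can be illustrated roughly as follows. This is a minimal single-process sketch, not the authors' MR-PFP implementation: it only simulates the item-counting pass that FP-growth-style miners start from, and it does not build an FP-tree or use Hadoop. The function names, the file names, and the item encoding (`zoneA@08h` for "zone A during the 08:00 hour") are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def pack_small_files(files):
    """Merge many small files into one list of (file_name, content) records.

    This loosely mimics storing small files as key-value pairs in a single
    container (the role a SequenceFile plays on Hadoop); it is an assumption
    made for illustration, not the Hadoop API.
    """
    return [(name, files[name]) for name in sorted(files)]


def map_count(record):
    """Map step: emit (item, 1) once per distinct item in each transaction line."""
    _, content = record
    for line in content.splitlines():
        for item in set(line.split()):
            yield item, 1


def reduce_count(pairs):
    """Reduce step: sum the emitted counts per item."""
    counts = Counter()
    for item, n in pairs:
        counts[item] += n
    return counts


def frequent_items(records, min_support):
    """Return the items whose support count is at least min_support."""
    pairs = (pair for record in records for pair in map_count(record))
    counts = reduce_count(pairs)
    return {item: c for item, c in counts.items() if c >= min_support}


# Hypothetical input: one small file per taxi, one transaction per line.
files = {
    "taxi_001.txt": "zoneA@08h zoneB@08h\nzoneA@08h zoneC@09h",
    "taxi_002.txt": "zoneA@08h zoneB@08h",
    "taxi_003.txt": "zoneB@08h zoneC@09h",
}
result = frequent_items(pack_small_files(files), min_support=3)
```

In a real MapReduce job the map calls would run on separate input splits and the reduce calls on separate keys; here they run sequentially so that the data flow is easy to follow.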


Bibliographic details

Source
Complexity | 2018, Issue 2 | 16 pages
Authors
Xia Dawen; Lu Xiaonan; Li Huaqing; Wang Wendong; Li Yantao; Zhang Zili
Author affiliations

Guizhou Minzu Univ, Coll Data Sci & Informat Engn, Guiyang 550025, Guizhou, Peoples R China;

Guizhou Minzu Univ, Coll Data Sci & Informat Engn, Guiyang 550025, Guizhou, Peoples R China;

Southwest Univ, Coll Elect & Informat Engn, Chongqing 400715, Peoples R China;

Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China;

Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China;

Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China;

Original format: PDF
Language of text: English
Chinese Library Classification: Large-scale systems theory
Similar literature

1. A MapReduce-Based Parallel Frequent Pattern Growth Algorithm for Spatiotemporal Association Analysis of Mobile Trajectory Big Data [J]. Xia Dawen, Lu Xiaonan, Li Huaqing, Complexity. 2018, Issue 1
2. MapReduce-based Parallel Algorithms for Multidimensional Data Analysis [J]. Jie Pan, Frederic Magoules, Yann Le Biannic. Journal of Algorithms & Computational Technology. 2012, Issue 2
3. Parallel and Distributed Algorithms for Frequent Pattern Mining in Large Databases [J]. Syed Khairuzzaman Tanbeer, Chowdhury Farhan Ahmed, Byeong-Soo Jeong. IETE Technical Review. 2009, Issue 1
4. The Comparison of Apriori Algorithm with Preprocessing and FP-Growth Algorithm for Finding Frequent Data Pattern in Association Rule [C]. Deo Wicaksono, Muhammad Ihsan Jambak, Danny Matthew Saputra. Sriwijaya International Conference on Information Technology and Its Applications. 2020
5. New Algorithms for Frequent Sequential Pattern and Itemset Data Mining in Certain and Uncertain Databases [D]. Peterson, Erich Allen. 2012
6. MapReduce-Based Parallel Genetic Algorithm for CpG-Site Selection in Age Prediction [O]. Zahra Momeni, Mohammad Saniee Abadeh. 2019
7. PaMPa-HD: A Parallel MapReduce-Based Frequent Pattern Miner for High-Dimensional Data [O]. Apiletti Daniele, Baralis Elena Maria, Cerquitelli Tania, 2015
