
A Parallel Association Mining Algorithm for Analyzing Passenger Travel Characteristics


Abstract

Source file: AU2020101071A420200723.pdf

Frequent pattern mining is an effective approach to spatiotemporal association analysis of mobile trajectory big data in data-driven intelligent transportation systems, and it can support decision making for urban transportation optimization and control in smart cities. Although existing parallel algorithms have been successfully applied to frequent pattern mining of large-scale trajectory data on Hadoop using MapReduce, two major challenges remain: how to overcome the inherent defects of Hadoop when handling taxi trajectory big data that includes massive numbers of small files, and how to discover implicit spatiotemporal frequent patterns with the MapReduce paradigm. To address these challenges, this invention presents a MapReduce-based Parallel Frequent Pattern growth algorithm, MR-PFP, which analyzes the spatiotemporal characteristics of passenger travel from large-scale taxi trajectories, using massive-small-file processing strategies on a Hadoop platform. More specifically, the invention consists of three phases. We first implement three methods, Hadoop Archives (HAR), CombineFileInputFormat (CFIF) and Sequence Files (SF), to overcome these defects of Hadoop, and then propose two strategies based on evaluations of their memory consumption and execution efficiency. Next, we incorporate SF into the Frequent Pattern growth (FP-growth) algorithm and implement the optimized FP-growth algorithm on a MapReduce framework. Finally, we use MR-PFP to analyze the characteristics of passenger travel in both the spatial and temporal dimensions in parallel. The invention has broad application in big data analytics.
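The abstract does not spell out how FP-growth is mapped onto MapReduce, so the following is only a minimal in-memory sketch of the first pass that FP-growth variants generally need: counting item support in a map/reduce style and reordering each transaction by descending support before tree construction. It does not use Hadoop and is not the patent's implementation; the trip records, the item naming scheme (`pickup:`, `dropoff:`, `hour:`), and all function names are assumptions made for illustration.

```python
from collections import Counter
from functools import reduce
from itertools import chain

# Hypothetical trip records: each transaction is a set of spatiotemporal items.
# Zone labels and hour buckets are illustrative, not taken from the patent.
TRIPS = [
    {"pickup:A", "dropoff:B", "hour:08"},
    {"pickup:A", "dropoff:C", "hour:08"},
    {"pickup:A", "dropoff:B", "hour:18"},
    {"pickup:D", "dropoff:B", "hour:08"},
]

def map_count(split):
    """Map step: produce a partial item count for one input split."""
    return Counter(chain.from_iterable(split))

def reduce_counts(partials):
    """Reduce step: sum the partial counts from all mappers."""
    return reduce(lambda a, b: a + b, partials, Counter())

def frequent_items(transactions, min_support, n_splits=2):
    """Count items split by split, then keep those meeting min_support."""
    splits = [transactions[i::n_splits] for i in range(n_splits)]
    totals = reduce_counts(map_count(s) for s in splits)
    return {item: n for item, n in totals.items() if n >= min_support}

def order_transaction(transaction, support):
    """Drop infrequent items and sort by descending support (ties by name),
    the item order FP-growth uses when inserting into an FP-tree."""
    kept = [i for i in transaction if i in support]
    return sorted(kept, key=lambda i: (-support[i], i))

support = frequent_items(TRIPS, min_support=2)
ordered = [order_transaction(t, support) for t in TRIPS]
```

In a real MapReduce job the splits would be input blocks read by separate mappers (for example from a Sequence File that packs many small trajectory files), and the ordered transactions would feed a later tree-building and mining stage that this sketch omits.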
