Conference: IEEE International Conference on Big Data

Distributed Mining of Spatial High Utility Itemsets in Very Large Spatiotemporal Databases using Spark In-Memory Computing Architecture

Abstract

Finding Spatial High Utility Itemsets (SHUIs) in a spatiotemporal database is a challenging problem of great importance in many real-world applications. Most previous work focused on the sequential discovery of SHUIs in a database running on a single machine. Consequently, these approaches are not suitable for big data (or cloud-based) applications, as they suffer from scalability and fault-tolerance problems. This paper proposes several novel pruning techniques to reduce the search space and presents a more flexible distributed algorithm that finds all desired itemsets in the database using the Spark in-memory computing architecture. Our algorithm inherits several advantages of Spark, including low communication cost, fault tolerance, and high scalability. Experimental results demonstrate that the proposed algorithm scales and performs well on very large databases. Finally, we present a real-world navigation application in which SHUIs generated from traffic congestion data are used to recommend alternative routes to users.