Venue: International Conference on Information Technology Research

Real-Time Uber Data Analysis of Popular Uber Locations in Kubernetes Environment

Abstract

Data is crucial in today's business and technology environment. Demand is growing for Big Data applications that extract and evaluate information, providing the knowledge needed to make important decisions rationally. These ideas emerged at the beginning of the 21st century, and every technology giant now exploits Big Data technologies. Big Data refers to very large and varied data collections that can be structured or unstructured. Big Data analytics is the practice of analyzing massive data sets to reveal trends and patterns. Uber uses real-time Big Data to refine its processes, from calculating prices to finding the optimal positioning of taxis to maximize profit. Real-time data analysis is challenging to implement because the data must be processed as it arrives, and Big Data volumes make this harder still. Real-time analysis that identifies Uber's popular pickup locations would be advantageous in various ways, but it requires a high-performance platform to run the application. So far, no research has addressed real-time analysis for identifying popular Uber locations within Big Data in a distributed environment, particularly in a Kubernetes environment. To address these issues, we built a machine learning model with the Spark framework to identify popular Uber locations, used the model to analyze real-time streaming Uber data, and deployed the system on Google Dataproc with different numbers of worker nodes, both with and without Kubernetes enabled. With the proposed Kubernetes environment and a larger number of Dataproc worker nodes, performance can be significantly improved. Future work will visualize the real-time popular Uber locations on Google Maps.
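
The abstract does not say which model is trained or how the stream is scored, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes k-means clustering of pickup coordinates as the model, uses plain Python instead of Spark so that it stays self-contained, and runs on made-up coordinates. The function names (`fit_kmeans`, `popular_clusters`) and the sample data are hypothetical.

```python
from collections import Counter
from math import hypot


def nearest(point, centers):
    """Return the index of the center closest to point.

    Uses plain Euclidean distance on (lat, lon), a rough approximation
    that is acceptable for a small city-sized area.
    """
    return min(
        range(len(centers)),
        key=lambda i: hypot(point[0] - centers[i][0], point[1] - centers[i][1]),
    )


def fit_kmeans(points, k, iterations=10):
    """Fit k cluster centers with Lloyd's algorithm.

    Assumption: the first k points seed the centers, which keeps the
    sketch deterministic; real systems usually use random or k-means++ seeding.
    """
    centers = [points[i] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def popular_clusters(stream, centers):
    """Count incoming pickups per cluster and list clusters by popularity."""
    counts = Counter(nearest(p, centers) for p in stream)
    return counts.most_common()


# Hypothetical historical pickups (lat, lon) used to fit the model.
history = [
    (40.75, -73.99), (40.65, -73.78), (40.76, -73.98),
    (40.64, -73.79), (40.74, -73.99),
]
centers = fit_kmeans(history, k=2)

# Hypothetical batch of newly arrived pickups standing in for the live stream.
incoming = [(40.755, -73.985), (40.752, -73.99), (40.646, -73.782), (40.748, -73.987)]
ranking = popular_clusters(incoming, centers)
print(ranking)
```

In a Spark deployment such as the one the abstract describes, the fitting step would run as a distributed job and the counting step would run over a streaming source. The sketch only shows the split between training on historical pickups and scoring new ones as they arrive.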
