IEEE International Conference on Big Data

Industrial track: Architecting railway KPIs data processing with Big Data technologies

Abstract

In this research we built a data processing pipeline for storing railway KPI data, based on open-source Big Data technologies: Apache Hadoop, Kafka, Kafka HDFS Connector, Spark, Airflow and PostgreSQL. The methodology we created for data load testing allowed us to run load tests iteratively with increasing data size, to evaluate the cluster software and hardware resources needed and, finally, to detect the bottlenecks of the solution. As a result of the research we propose an architecture for data processing and storage and give recommendations on data pipeline optimization. In addition, we calculated the approximate sizing of the cluster machines for the data processing and storage services at the current dataset volume.
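
The abstract does not spell out the load-testing methodology, so the following is only a minimal sketch of the general idea it describes: run the same pipeline repeatedly with growing input sizes, time each stage, and flag the slowest stage as the candidate bottleneck. Everything in it is an assumption made for illustration. The record fields, the stage functions `ingest`, `transform` and `store`, and the helper `run_load_test` are hypothetical in-process stand-ins, not the authors' code and not the Kafka, HDFS, Spark or PostgreSQL APIs.

```python
import time
from typing import Callable, Dict, List

# Hypothetical stage type: stands in for a real pipeline step.
Stage = Callable[[List[dict]], List[dict]]


def make_records(n: int) -> List[dict]:
    """Generate n synthetic KPI records (field names are invented for illustration)."""
    return [
        {"train_id": i % 100, "kpi": "delay_minutes", "value": float(i % 17)}
        for i in range(n)
    ]


def ingest(records: List[dict]) -> List[dict]:
    """Stand-in for message ingestion: simply copies the batch."""
    return list(records)


def transform(records: List[dict]) -> List[dict]:
    """Stand-in for batch processing: mean KPI value per train."""
    totals: Dict[int, tuple] = {}
    for r in records:
        s, c = totals.get(r["train_id"], (0.0, 0))
        totals[r["train_id"]] = (s + r["value"], c + 1)
    return [{"train_id": t, "mean_value": s / c} for t, (s, c) in totals.items()]


def store(records: List[dict]) -> List[dict]:
    """Stand-in for a database write: sorts rows as a placeholder for real work."""
    return sorted(records, key=lambda r: r["train_id"])


def run_load_test(stages: Dict[str, Stage], sizes: List[int]) -> List[dict]:
    """Run all stages at each data size and record per-stage wall-clock time."""
    results = []
    for size in sizes:
        data = make_records(size)
        timings: Dict[str, float] = {}
        for name, stage in stages.items():
            start = time.perf_counter()
            data = stage(data)
            timings[name] = time.perf_counter() - start
        # The slowest stage at this size is the candidate bottleneck.
        bottleneck = max(timings, key=timings.get)
        results.append(
            {"size": size, "timings": timings, "bottleneck": bottleneck, "rows_out": len(data)}
        )
    return results


if __name__ == "__main__":
    stages = {"ingest": ingest, "transform": transform, "store": store}
    for row in run_load_test(stages, [1_000, 10_000, 100_000]):
        print(row["size"], row["bottleneck"], row["timings"])
```

In a real cluster each stage would wrap a call to the actual service, and the timings would be recorded alongside CPU, memory and disk usage so that the results can inform machine sizing.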
