Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

Apache Spark Accelerated Deep Learning Inference for Large Scale Satellite Image Analytics


Abstract

The sheer volumes of data generated by earth observation and remote sensing technologies continue to make a major impact, propelling key geospatial applications into an era that is both data-intensive and compute-intensive. This rapid advancement poses new computational and data processing challenges. We implement a novel remote sensing data flow (RESFlow) that enables machine learning to compute with massive amounts of remotely sensed imagery. The core contribution is partitioning massive amounts of data into homogeneous distributions for fitting simple models. RESFlow takes advantage of Apache Spark and modern computing hardware to accelerate deep learning inference on expansive remote sensing imagery. The framework incorporates a strategy to optimize resource utilization across multiple executors assigned to a single worker. We showcase its deployment in both compute-intensive and data-intensive workloads for pixel-level labeling tasks. The pipeline invokes deep learning inference at three stages: deep feature extraction, deep metric mapping, and deep semantic segmentation. These tasks are compute-intensive and raise GPU resource sharing challenges, which motivates a parallelized pipeline for all execution steps. To address hardware resource contention, our containerized workflow further incorporates a novel GPU checkout routine and a ticketing system across multiple workers. The workflow is demonstrated on NVIDIA DGX accelerated platforms and offers appreciable speed-ups for deep learning inference on pixel labeling workloads: it processes 21 028 TB of imagery data and delivers output maps at an area rate of 5.245 sq. km/s, amounting to 453 168 sq. km/day, which reduces a 28-day workload to 21 h.
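The throughput figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the stated numbers, not part of RESFlow; the variable names are illustrative, and it assumes the daily figure is the per-second rate sustained over a full 24-hour day.

```python
from fractions import Fraction

SECONDS_PER_DAY = 24 * 60 * 60  # 86 400 s

# Figures quoted in the abstract (variable names are illustrative).
area_rate_km2_per_s = Fraction("5.245")  # output map rate, sq. km per second
baseline_days = 28                       # workload duration before acceleration
accelerated_hours = 21                   # workload duration after acceleration

# Daily coverage implied by the per-second rate (exact rational arithmetic).
area_per_day_km2 = area_rate_km2_per_s * SECONDS_PER_DAY

# Speed-up implied by going from 28 days to 21 hours.
speedup = Fraction(baseline_days * 24, accelerated_hours)

print(f"{float(area_per_day_km2):,.0f} sq. km/day")
print(f"{float(speedup):.0f}x speed-up")
```

Under that assumption the per-second rate reproduces the quoted 453 168 sq. km/day exactly, and the reduction from 28 days to 21 h corresponds to a 32-fold speed-up.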
