Conference: International Conference on Advanced Cloud and Big Data

PN-FINDR: A Parallelized N-FINDR Algorithm with Spark


Abstract

N-FINDR is a well-known method for extracting endmembers from hyperspectral images. However, N-FINDR processing of hyperspectral data is both data-intensive and compute-intensive. On the one hand, with the development of hyperspectral sensors, the volume of hyperspectral data is growing quickly, and processing such massive data efficiently is challenging; on the other hand, the N-FINDR algorithm needs to scan the whole dataset to compute the volume of the simplex. Together, these make the N-FINDR algorithm very time-consuming on large datasets. In this paper, a distributed parallel optimization of the N-FINDR algorithm (PN-FINDR) on Spark is presented. To accelerate the distributed parallel computation, broadcast variables are used to reduce the data transmitted to each computing node, the intermediate data storage structure is designed to reduce the time spent on shuffle data transmission, and the RDD cache is used for fast iterative calculation. Experiments conducted on real hyperspectral images of different sizes demonstrate a significant improvement in the efficiency of massive hyperspectral data processing.
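
To make the "scan the whole dataset to compute the volume of the simplex" step concrete, here is a minimal single-machine sketch of the textbook N-FINDR search in plain Python. It is not the paper's Spark implementation, which the abstract does not spell out. The function names (`det`, `simplex_volume`, `n_findr`), the choice of the first p pixels as the starting set, and the `max_sweeps` limit are all assumptions made for illustration. Pixels are assumed to be already reduced to p-1 dimensions.

```python
from math import factorial


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0  # singular: the points are degenerate
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d  # a row swap flips the sign
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def simplex_volume(endmembers):
    """Volume of the simplex spanned by p points in (p-1) dimensions."""
    p = len(endmembers)
    # Each row is [1, e_1, ..., e_(p-1)]; transposing does not change the determinant.
    m = [[1.0] + list(e) for e in endmembers]
    return abs(det(m)) / factorial(p - 1)


def n_findr(pixels, p, max_sweeps=10):
    """Greedy N-FINDR: swap in any pixel that enlarges the simplex."""
    # Assumption: start from the first p pixels (implementations often start randomly).
    endmembers = [pixels[i] for i in range(p)]
    best = simplex_volume(endmembers)
    for _ in range(max_sweeps):
        changed = False
        for j in range(p):          # each endmember position in turn
            for x in pixels:        # full scan of the dataset
                trial = endmembers[:j] + [x] + endmembers[j + 1:]
                v = simplex_volume(trial)
                if v > best + 1e-12:
                    endmembers, best, changed = trial, v, True
        if not changed:
            break                   # no pixel improves the volume any further
    return endmembers, best
```

The inner loop over `pixels` is the full-dataset scan the abstract refers to, and it repeats for every endmember position and every sweep. In a distributed setting, the small `endmembers` list is the kind of value one would plausibly ship as a broadcast variable, and the repeatedly scanned `pixels` collection is the kind one would plausibly cache as an RDD, though how PN-FINDR actually arranges this is not described here.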
