PARSUC: A Parallel Subsampling-Based Method for Clustering Remote Sensing Big Data

Abstract

Remote sensing big data (RSBD) is generally characterized by huge volume, diversity, and high dimensionality. Mining hidden information from RSBD for different applications poses significant computational challenges. Clustering is an important data mining technique widely used in processing and analyzing remote sensing imagery. However, conventional clustering algorithms are designed for relatively small datasets. When applied to RSBD problems, they are, in general, too slow or inefficient for practical use. In this paper, we propose a parallel subsampling-based clustering (PARSUC) method for improving the performance of RSBD clustering in terms of both efficiency and accuracy. PARSUC leverages a novel subsampling-based data partitioning (SubDP) method to realize three-step parallel clustering, effectively resolving the notable performance bottleneck of existing parallel clustering algorithms; that is, they must perform numerous repeated calculations to reach a reasonable result. Furthermore, we propose a centroid filtering algorithm (CFA) to eliminate subsampling errors and to guarantee the accuracy of the clustering results. PARSUC was implemented on a Hadoop platform using the MapReduce parallel model. Experiments conducted on massive remote sensing images of different sizes showed that PARSUC (1) provided much better accuracy than conventional remote sensing clustering algorithms in handling larger image data; (2) scaled well as computing nodes were added; and (3) took much less time than the existing parallel clustering algorithm in handling RSBD.
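
The abstract names the building blocks (subsampling-based partitioning, per-partition clustering, centroid filtering) but gives no algorithmic detail, so the following is only a rough single-machine sketch of the general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the reading of the three steps as partition, cluster locally, and merge; plain k-means as the base clusterer; the median-based cutoff in the hypothetical filter_centroids (a stand-in for CFA, whose actual rule is not described here); and all function and parameter names. A sequential loop takes the place of the MapReduce map and reduce phases.

```python
import random
from statistics import median

def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed base clusterer); returns k centroids."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            groups[nearest].append(p)
        # An empty group keeps its previous centroid.
        centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return centroids

def subsample_partitions(points, n_parts, seed=0):
    """Assumed step 1: shuffle, then deal points into n_parts random subsamples."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]

def filter_centroids(local_centroids, k, factor=3.0):
    """Hypothetical stand-in for CFA: drop local centroids whose squared
    distance to the nearest consensus centroid exceeds factor * median."""
    consensus = kmeans(local_centroids, k)
    d = [min(dist2(c, g) for g in consensus) for c in local_centroids]
    cutoff = factor * median(d)
    return [c for c, di in zip(local_centroids, d) if di <= cutoff]

def parsuc_sketch(points, k, n_parts=4):
    """Illustrative pipeline; needs n_parts >= 2 so that at least k
    local centroids survive the filter."""
    parts = subsample_partitions(points, n_parts)              # step 1: partition
    local = [c for part in parts for c in kmeans(part, k)]     # step 2: "map"
    final = kmeans(filter_centroids(local, k), k)              # step 3: "reduce"
    labels = [min(range(k), key=lambda i: dist2(p, final[i])) for p in points]
    return final, labels
```

Calling parsuc_sketch(pixels, k=5), where pixels is a list of feature tuples (for example, per-pixel band values), returns the merged centroids and one cluster label per input point. Because each subsample is clustered independently, step 2 is the part that a MapReduce job would distribute across nodes.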

Bibliographic record

Journal: Sensors (Basel, Switzerland)
Authors: Huiyu Xia; Wei Huang; Ning Li; Jianzhong Zhou; Dongying Zhang
Year (volume), issue: 2019 (19), 15
Page: 3438
Total pages: 19
Original format: PDF
Keywords: clustering; parallel computing; remote sensing big data; MapReduce
