
Large-scale seismic signal analysis with Hadoop


Abstract

In seismology, waveform cross correlation has been used for years to produce high-precision hypocenter locations and to build sensitive detectors. Because correlated seismograms are generally found only at small hypocenter separation distances, correlation detectors have historically been reserved for spotlight purposes. However, many regions have been found to produce large numbers of correlated seismograms, and there is growing interest in building next-generation pipelines that employ correlation as a core part of their operation. In an effort to better understand the distribution and behavior of correlated seismic events, we have cross correlated a global dataset consisting of over 300 million seismograms. This was done using a conventional distributed cluster and required 42 days. In anticipation of processing much larger datasets, we have re-architected the system to run as a series of MapReduce jobs on a Hadoop cluster. In doing so we achieved a factor-of-19 performance increase on a test dataset. We found that fundamental algorithmic transformations were required to achieve the maximum performance increase. Whereas in the original IO-bound implementation we went to great lengths to minimize IO, in the Hadoop implementation, where IO is cheap, we were able to greatly increase the parallelism of our algorithms by performing a tiered series of very fine-grained (highly parallelizable) transformations on the data. Each of these MapReduce jobs required reading and writing large amounts of data, but because IO is very fast, and because the fine-grained computations could be handled extremely quickly by the mappers, the net result was a large performance gain.
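The core kernel of the pipeline described above is waveform cross correlation. As a hedged illustration only (not the authors' implementation), a sliding normalized cross correlation of a short template against a longer trace can be sketched in NumPy; the function name and the epsilon guard against zero-variance windows are our own conventions:

```python
import numpy as np

def normalized_cross_correlation(template, trace):
    """Slide `template` along `trace` and return the normalized
    cross-correlation coefficient at every lag; values lie in [-1, 1],
    with 1 indicating a perfectly matching (scaled/shifted) waveform."""
    m = len(template)
    # Demean and pre-normalize the template once, folding in the 1/m factor.
    t = (template - template.mean()) / (template.std() * m)
    out = np.empty(len(trace) - m + 1)
    for i in range(len(out)):
        w = trace[i:i + m]
        # Small epsilon avoids division by zero on flat (all-zero) windows.
        out[i] = np.sum(t * (w - w.mean())) / (w.std() + 1e-12)
    return out
```

A production system would compute this in the frequency domain for long traces; the direct sliding form above is kept for clarity.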
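The MapReduce restructuring the abstract describes can be caricatured in plain Python: a mapper keys each waveform by station (since correlated seismograms occur only at small separation distances, comparisons can be restricted to co-located recordings), a shuffle groups by key, and a reducer correlates every pair within a group. The record layout, function names, and the 0.7 detection threshold here are illustrative assumptions, not the paper's actual job structure:

```python
from collections import defaultdict
from itertools import combinations

import numpy as np

def map_phase(records):
    """Mapper: emit (station, (event_id, waveform)) so the shuffle
    groups together only waveforms recorded at the same station."""
    for station, event_id, waveform in records:
        yield station, (event_id, waveform)

def shuffle(pairs):
    """Shuffle: group mapper output by key, as Hadoop does between stages."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(station, values, threshold=0.7):
    """Reducer: cross correlate every pair of waveforms at one station and
    emit event pairs whose peak normalized correlation clears the threshold."""
    for (id_a, a), (id_b, b) in combinations(values, 2):
        a0 = (a - a.mean()) / (a.std() * len(a))
        b0 = (b - b.mean()) / b.std()
        if np.max(np.correlate(a0, b0, mode="full")) >= threshold:
            yield station, (id_a, id_b)
```

Each stage is embarrassingly parallel, which is the point the abstract makes: fine-grained transformations trade extra IO between stages for much higher parallelism.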

Bibliographic information

  • Source
    Computers & Geosciences | 2014, Issue 5 | pp. 145-154 | 10 pages
  • Author affiliations

    Google Inc., 1600 Amphitheater Parkway, Mountain View, CA 94043, USA;

    Lawrence Livermore National Laboratory, 7000 East Avenue, MS 046, Livermore, CA 94550, USA;

    Lawrence Livermore National Laboratory, 7000 East Avenue, MS 046, Livermore, CA 94550, USA;

    Lawrence Livermore National Laboratory, 7000 East Avenue, MS 046, Livermore, CA 94550, USA;

  • Indexing information
  • Format: PDF
  • Language: English (eng)
  • CLC classification
  • Keywords

    Correlation; Hadoop; MapReduce; Seismology
