Conference: IEEE International Conference on Big Data

Efficient Data Management in Neutron Scattering Data Reduction Workflows at ORNL

Abstract

Oak Ridge National Laboratory (ORNL) experimental neutron science facilities produce 1.2 TB a day of raw event-based data, which is stored using the standard metadata-rich NeXus schema built on top of the HDF5 file format. The performance of several data reduction workflows is largely determined by the time spent in the loading and processing algorithms of Mantid, an open-source data analysis framework used at several neutron science facilities around the world. The present work introduces new data management algorithms to address identified input/output (I/O) bottlenecks in Mantid. First, we introduce an in-memory binary-tree metadata index that resembles NeXus data access patterns to provide a scalable search and extraction mechanism. Second, data encapsulation in Mantid algorithms is optimally redesigned to reduce the total compute and memory runtime footprint associated with metadata I/O reconstruction tasks. Results from this work show speedups in wall-clock time on ORNL data reduction workflows ranging from 11% to 30%, depending on the complexity of the targeted instrument-specific data. Nevertheless, we highlight the need for more research to address reduction challenges as experimental data volumes increase.
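
The abstract does not give the layout of the metadata index, so the following is only a minimal sketch of the general idea, not the Mantid implementation. It assumes the index maps a NeXus class name (such as NXlog) to the entry paths of that class. Python has no built-in balanced binary tree, so a sorted list searched with `bisect` stands in for the tree and gives the same logarithmic lookup. The class `MetadataIndex`, its methods, and the sample paths are all hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


class MetadataIndex:
    """Hypothetical in-memory index: NeXus class name -> sorted entry paths."""

    def __init__(self):
        self._paths_by_class = defaultdict(list)

    def add(self, nx_class, path):
        # Insert at the binary-search position so each list stays sorted;
        # duplicates are skipped.
        paths = self._paths_by_class[nx_class]
        i = bisect_left(paths, path)
        if i == len(paths) or paths[i] != path:
            paths.insert(i, path)

    def contains(self, nx_class, path):
        # O(log n) membership test, with no file access involved.
        paths = self._paths_by_class.get(nx_class, [])
        i = bisect_left(paths, path)
        return i < len(paths) and paths[i] == path

    def under(self, nx_class, prefix):
        # Paths sharing a prefix are contiguous in sorted order, so jump to
        # the first candidate and scan until the prefix stops matching.
        paths = self._paths_by_class.get(nx_class, [])
        result = []
        for p in paths[bisect_left(paths, prefix):]:
            if not p.startswith(prefix):
                break
            result.append(p)
        return result


# Sample entries are made up for illustration; a real index would be filled
# by walking the HDF5 file once when it is opened.
index = MetadataIndex()
for nx_class, path in [
    ("NXevent_data", "/entry/bank2_events"),
    ("NXevent_data", "/entry/bank1_events"),
    ("NXlog", "/entry/DASlogs/proton_charge"),
    ("NXlog", "/entry/DASlogs/frequency"),
    ("NXinstrument", "/entry/instrument"),
]:
    index.add(nx_class, path)

print(index.under("NXlog", "/entry/DASlogs/"))
```

Once such an index is built, later questions like "does this entry exist?" or "which logs sit under this group?" can be answered from memory rather than by repeated metadata reads from the file, which is the kind of bottleneck the abstract describes.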
