IEEE International Conference on Cluster Computing
SoMeta: Scalable Object-centric Metadata Management for High Performance Computing

Abstract

Scientific data sets, which grow rapidly in volume, often carry plentiful metadata, such as information about the experiment or simulation that produced them. When this metadata is not managed well, the data sets become difficult to use and their value is lost over time. Ideally, metadata should be managed along with its corresponding data by a single storage system, where it can be accessed and updated directly. However, existing storage systems in high-performance computing (HPC) environments, such as the Lustre parallel file system, still use a static metadata structure composed of a fixed, non-extensible amount of information. The burden of metadata management therefore falls upon end users and requires ad hoc metadata management software to be developed. The advent of "object-centric" storage systems offers an opportunity to solve this issue. In this paper, we present SoMeta, a scalable and decentralized metadata management approach for object-centric storage in HPC systems. It provides a flat namespace that is dynamically partitioned, a tagging approach to manage metadata that can be efficiently searched and updated, and a lightweight, fault-tolerant management strategy. In our experiments, SoMeta achieves up to a 3.7X speedup over Lustre in performing common metadata operations, and is up to 16X faster than SciDB and MongoDB for advanced metadata operations, such as adding and searching tags. Additionally, in contrast to existing storage systems, SoMeta offers scalable user-space metadata management by allowing users to specify the number of metadata servers depending on their workload.
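
The ideas named in the abstract (a flat namespace spread over a user-chosen number of metadata servers, and tags that can be added and searched) can be illustrated with a small in-memory sketch. This is not SoMeta's implementation or API: the class names `MetadataServer` and `FlatNamespace`, the SHA-1 hash placement, and the inverted tag index are all assumptions made for illustration, and the sketch uses a fixed hash partition rather than the dynamic partitioning the paper describes.

```python
import hashlib
from collections import defaultdict


class MetadataServer:
    """One in-memory metadata partition (hypothetical, not SoMeta's API)."""

    def __init__(self):
        self.objects = {}                  # object name -> {tag: value}
        self.tag_index = defaultdict(set)  # (tag, value) -> set of object names

    def add_tag(self, name, tag, value):
        # Tag values must be hashable because they are used as index keys.
        tags = self.objects.setdefault(name, {})
        if tag in tags:
            # Updating a tag: drop the stale entry from the inverted index.
            self.tag_index[(tag, tags[tag])].discard(name)
        tags[tag] = value
        self.tag_index[(tag, value)].add(name)

    def search(self, tag, value):
        # .get() avoids creating empty index entries for unknown tags.
        return set(self.tag_index.get((tag, value), set()))


class FlatNamespace:
    """Flat object namespace hashed over a user-chosen number of servers."""

    def __init__(self, num_servers):
        self.servers = [MetadataServer() for _ in range(num_servers)]

    def _server_for(self, name):
        # Assumed placement rule: hash the full object name, no directory tree.
        digest = hashlib.sha1(name.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        return self.servers[index]

    def add_tag(self, name, tag, value):
        self._server_for(name).add_tag(name, tag, value)

    def get_tags(self, name):
        return dict(self._server_for(name).objects.get(name, {}))

    def search(self, tag, value):
        # A tag query fans out to every partition and merges the results.
        found = set()
        for server in self.servers:
            found |= server.search(tag, value)
        return sorted(found)
```

In this sketch, a lookup by object name touches exactly one server, while a tag search contacts all of them. Choosing `num_servers` at construction time mirrors, in a very simplified way, the abstract's point that users can size the metadata service to their workload.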
