Journal: IEEE Transactions on Parallel and Distributed Systems

Explicit Data Correlations-Directed Metadata Prefetching Method in Distributed File Systems

Abstract

Metadata performance in distributed file systems (DFS) is critical because of two trends: (a) modern storage systems are growing and are expected to exceed billions of files, most of them small; (b) over half of all file accesses are metadata operations. In this work, we present SMeta, a metadata prefetching method that is seamlessly integrated into the DFS for ease of use and significantly scales metadata performance. Previous prefetching proposals primarily focus on mining the access history for groups of files that tend to be accessed together. Our study found, however, that these solutions are likely to miss a large number of correlated files whose co-occurrence frequency is not high enough. Instead of access correlations, we take a different approach and explore explicit data correlations: the reference relationships between files, encoded as some form of hyperlink, which exist naturally in many applications. To support this concept, SMeta extracts correlations when files are written using a lightweight pattern-matching algorithm, stores the correlations in the reserved extended attributes of file metadata to avoid changes to the DFS APIs, and collapses the multiple I/O rounds needed to access the metadata of a target file and its data-correlated files into a single round. A cost-efficient adaptive feedback mechanism is introduced to improve prefetching accuracy. We implemented SMeta atop Ceph and evaluated it using synthetic and real system workloads. Compared to the baselines, SMeta provides better metadata performance in terms of latency, throughput, and scalability.
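
The three steps named in the abstract (extract references at write time, keep them in an extended attribute, return correlated metadata in one round) can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not SMeta's implementation and not Ceph's API: the class `ToyMetadataServer`, the attribute key `user.correlated`, the `href`/`src` pattern, and the round-trip counter are all hypothetical names chosen for illustration, and the adaptive feedback mechanism is not modeled.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern: treat href="..." and src="..." values as file references.
LINK_RE = re.compile(r'(?:href|src)\s*=\s*"([^"]+)"')


@dataclass
class Inode:
    size: int
    xattrs: dict = field(default_factory=dict)


class ToyMetadataServer:
    """In-memory stand-in for a metadata server (illustrative only)."""

    XATTR_KEY = "user.correlated"  # hypothetical extended-attribute name

    def __init__(self):
        self.inodes = {}
        self.round_trips = 0  # counts simulated client/server rounds

    def write(self, path, content):
        # Extract references while the file is being written and
        # store them alongside the file's metadata.
        links = sorted(set(LINK_RE.findall(content)))
        inode = Inode(size=len(content))
        inode.xattrs[self.XATTR_KEY] = links
        self.inodes[path] = inode

    def stat(self, path):
        # Baseline: one round per file.
        self.round_trips += 1
        return self.inodes[path]

    def stat_with_prefetch(self, path):
        # One round returns the target's metadata plus the metadata of
        # every correlated file that exists on this server.
        self.round_trips += 1
        target = self.inodes[path]
        batch = {path: target}
        for ref in target.xattrs.get(self.XATTR_KEY, []):
            if ref in self.inodes:
                batch[ref] = self.inodes[ref]
        return batch


server = ToyMetadataServer()
server.write("logo.png", "PNGDATA")
server.write("style.css", "body {}")
server.write("index.html", '<link href="style.css"><img src="logo.png">')

batch = server.stat_with_prefetch("index.html")
print(sorted(batch), server.round_trips)
```

In this toy model, fetching `index.html` with prefetching costs one simulated round and returns three metadata entries, whereas calling `stat` on each of the three files separately would cost three rounds.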
