Explicit Data Correlations-Directed Metadata Prefetching Method in Distributed File Systems

Chen Youxu; Li Cheng; Lv Min; Shao Xinyang; Li Yongkun; Xu Yinlong


Abstract

Metadata performance in distributed file systems (DFS) is critical because of two trends: (a) modern storage systems keep growing and are expected to exceed billions of files, most of which are small; (b) over half of all file accesses are metadata operations. In this work, we present SMeta, a metadata prefetching method that is seamlessly integrated into DFS for ease of use and significantly scales metadata performance. Previous prefetching proposals primarily focus on mining, from the access history, groups of files that tend to be accessed together. However, our study found that these solutions are likely to miss a large number of correlated files whose co-occurrence frequency is not high enough. Instead of access correlations, we take a different approach and explore explicit data correlations by understanding the reference relationships between files, which are encoded in some form of hyperlink and exist naturally in many applications. To realize this concept, SMeta extracts correlations when files are written, using a lightweight pattern-matching algorithm; stores the correlations in the reserved extended attributes of file metadata to avoid changes to DFS APIs; and collapses the multiple I/O rounds for accessing the metadata of the target file and its data-correlated files into one round. A cost-efficient adaptive feedback mechanism is introduced to improve prefetching accuracy. We implemented SMeta atop Ceph and evaluated it using synthetic and real system workloads. Compared to baselines, SMeta provides better metadata performance in terms of latency, throughput, and scalability.
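The three steps named in the abstract (extract references at write time, store them in an extended attribute, return the correlated metadata in one round) can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not SMeta's implementation and not Ceph's API: the `ToyMetadataServer` class, the `user.smeta.correlated` attribute name, and the `href`/`src` regular expression are all hypothetical, and the adaptive feedback mechanism is omitted.

```python
import re
from dataclasses import dataclass, field

# Hypothetical attribute name; the abstract only says "reserved extended attributes".
XATTR_KEY = "user.smeta.correlated"

# Assumed link syntax: HTML-style href/src targets. The real pattern set is not given.
LINK_PATTERN = re.compile(r'(?:href|src)\s*=\s*["\']([^"\']+)["\']')


@dataclass
class Inode:
    """Minimal stand-in for per-file metadata."""
    size: int
    xattrs: dict = field(default_factory=dict)


class ToyMetadataServer:
    """In-memory model of a metadata server; counts client round trips."""

    def __init__(self):
        self.inodes = {}
        self.round_trips = 0

    def write(self, path, content):
        # Extract explicit references when the file is written.
        links = LINK_PATTERN.findall(content)
        correlated = list(dict.fromkeys(links))  # de-duplicate, keep order
        inode = Inode(size=len(content))
        if correlated:
            # Store the correlations alongside the file's own metadata.
            inode.xattrs[XATTR_KEY] = "\n".join(correlated)
        self.inodes[path] = inode

    def lookup(self, path):
        # Baseline: one round trip per file.
        self.round_trips += 1
        return self.inodes[path]

    def lookup_with_prefetch(self, path):
        # One round trip returns the target plus its data-correlated files.
        self.round_trips += 1
        target = self.inodes[path]
        batch = {path: target}
        stored = target.xattrs.get(XATTR_KEY)
        if stored:
            for linked in stored.split("\n"):
                if linked in self.inodes:  # skip dangling references
                    batch[linked] = self.inodes[linked]
        return batch
```

In this model, writing `index.html` with references to `style.css` and `app.js` and then calling `lookup_with_prefetch("index.html")` costs one round trip for all three metadata entries, whereas three separate `lookup` calls cost three.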


Bibliographic details

Source: IEEE Transactions on Parallel and Distributed Systems, 2019, Issue 12, pp. 2692-2705 (14 pages)
Authors: Chen Youxu; Li Cheng; Lv Min; Shao Xinyang; Li Yongkun; Xu Yinlong
Affiliation: Univ Sci & Technol China Dept Comp Sci & Technol Hefei 230000 Anhui Peoples R China
Original format: PDF
Language: English
Keywords: distributed file system; metadata performance; prefetching; data correlations


