In shared big data clusters, competition among tenants for memory can lead to unfair allocation and inefficient use of cache resources. To improve both cache efficiency and fairness, and based on the characteristics of big data applications, we propose an incremental caching strategy called EarnCache: the more frequently a file is accessed, the more cache resources it earns. Using historical information on file access frequency, we formulate cache allocation and replacement as an optimization problem and present a solution. We implement EarnCache and other algorithms such as MAX-MIN in a distributed storage system and analyze their performance. Experiments show that EarnCache improves cache efficiency and overall resource utilization in shared big data clusters.
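The idea that a file earns cache in step with how often it is accessed can be illustrated with a minimal sketch. This is not the optimization formulation from the paper, which the abstract does not spell out; it assumes a simple proportional rule, and the function name `allocate_cache` and its parameters are hypothetical.

```python
def allocate_cache(access_counts, file_sizes, capacity):
    """Split `capacity` among files in proportion to their access counts.

    Assumption (not from the paper): a plain proportional rule, where no
    file receives more cache than its own size and any surplus is
    redistributed among the files that can still use it.
    """
    alloc = {f: 0.0 for f in file_sizes}
    remaining = float(capacity)
    # Only files that have been accessed at least once can earn cache.
    active = {f for f in file_sizes if access_counts.get(f, 0) > 0}

    while active and remaining > 1e-9:
        total = sum(access_counts[f] for f in active)
        saturated = set()
        spent = 0.0
        for f in active:
            share = remaining * access_counts[f] / total
            room = file_sizes[f] - alloc[f]
            grant = min(share, room)
            alloc[f] += grant
            spent += grant
            if file_sizes[f] - alloc[f] <= 1e-9:
                saturated.add(f)  # fully cached, needs no more
        remaining -= spent
        if not saturated:
            break  # every share fit, so the capacity is used up
        active -= saturated
    return alloc


if __name__ == "__main__":
    counts = {"a": 3, "b": 1}
    sizes = {"a": 100, "b": 100}
    print(allocate_cache(counts, sizes, capacity=100))
```

Under this assumed rule, a file accessed three times as often as another receives three times the cache, until it is fully cached and the surplus passes to the other files.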