Conference: International Conference on Big Data Computing and Communications

The Read Amplification Analysis of NoSQL Database on Top of OSDs: A Case Study of HBase

Abstract

NoSQL databases have shown great improvements in storing and processing large-scale datasets. In the conventional deployment architecture, they achieve very good performance because storage and compute run on the same node. Nevertheless, many NoSQL users have their own storage pool (e.g., an OSD server pool) that provides different interfaces for different NoSQL databases. This new application scenario brings many benefits, such as higher scalability, better flexibility, and easier data maintenance. However, the physical separation of the NoSQL application and the storage nodes can affect system performance. To better understand the new scenario, we take HBase, a common NoSQL database, as a case study. HBase uses a layered storage architecture with two main layers: a distributed database atop a distributed file system (HDFS). We perform experiments with the YCSB benchmark and a newly developed benchmark to evaluate the separated deployment of the RegionServer and the HDFS server and to verify the read amplification of network I/O between them. Based on our observations, we propose a novel OSD Agent direction, consisting of an HBase Local Storage Scanner and an HBase Local Storage Compactor, that reduces the amplification by filtering the requested data out of the data block on the OSD server pool. In simulation, the OSD Agent reduces network traffic amplification from 36x-92x to 1.7x for HBase read operations and from 1.2x-21x to 1.2x for HBase scans with a column filter. Moreover, the OSD Agent makes read performance less sensitive to network bandwidth. The OSD Agent thus brings a substantial performance improvement for NoSQL databases on top of OSDs.
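The amplification figures above can be read as a ratio of bytes moved over the network to bytes the client actually requested. The following is a minimal sketch of that arithmetic, not the paper's benchmark or the OSD Agent itself. It assumes a 64 KiB data block (the HBase default HFile block size), a hypothetical 1 KiB record, and a hypothetical 256-byte per-response overhead for the filtered case; the function names are illustrative only.

```python
def amplification(bytes_transferred: int, bytes_requested: int) -> float:
    """Ratio of bytes sent over the network to bytes the client asked for."""
    if bytes_requested <= 0:
        raise ValueError("bytes_requested must be positive")
    return bytes_transferred / bytes_requested


# Assumption: HBase default HFile data block size (64 KiB).
BLOCK_SIZE = 64 * 1024
# Hypothetical per-response framing cost when the storage side filters.
FILTER_OVERHEAD = 256


def whole_block_read(record_size: int, block_size: int = BLOCK_SIZE) -> float:
    """Separated deployment: the whole data block crosses the network."""
    return amplification(block_size, record_size)


def filtered_read(record_size: int, overhead: int = FILTER_OVERHEAD) -> float:
    """Storage-side filtering: only the record plus overhead is sent."""
    return amplification(record_size + overhead, record_size)


if __name__ == "__main__":
    record = 1024  # hypothetical 1 KiB record
    print(f"whole block: {whole_block_read(record):.2f}x")
    print(f"filtered:    {filtered_read(record):.2f}x")
```

Under these assumed sizes, fetching the whole block gives a ratio in the tens, while filtering on the storage side keeps it close to 1. The specific values depend entirely on the chosen record size and overhead and are not the paper's measurements.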