Conference: International Conference on Big Data Computing and Communications

The Read Amplification Analysis of NoSQL Database on Top of OSDs: A Case Study of HBase

Abstract

NoSQL databases have shown great improvements in storage and computing for large-scale datasets. In the conventional deployment architecture, they achieve very good performance because the storage and compute components run on the same node. Nevertheless, many NoSQL users have their own storage pool (e.g., an OSD server pool) that provides different interfaces for different NoSQL databases. This new application scenario brings many benefits, such as higher scalability, better flexibility, and better data maintainability. However, the physical separation of the NoSQL application from the storage nodes can affect system performance. To better understand the new scenario, we take HBase, a common NoSQL database, as a case study. HBase uses a layered storage architecture with two main layers: a distributed database atop a distributed file system (HDFS). We perform experiments with the YCSB benchmark and a newly developed benchmark to evaluate the separated deployment of the RegionServer and the HDFS server, and to verify the read amplification of network I/O between them. Based on our observations, we propose a novel OSD Agent scheme, consisting of an HBase Local Storage Scanner and an HBase Local Storage Compactor, that reduces the amplification by extracting the requested data from the data blocks on the OSD server pool. In simulation, the OSD Agent reduces the network traffic amplification from 36x-92x to 2x for HBase read operations, and from 1.2x-21x to 1.2x for HBase scan operations with a column filter. Moreover, the OSD Agent makes read performance less sensitive to network bandwidth. Overall, the OSD Agent brings a substantial performance improvement for NoSQL databases on top of OSDs.
