International Conference on Mechatronic Sciences, Electric Engineering and Computer

TIIS: A two-level inverted-index scheme for large-scale data processing in the parallel database system

Abstract

Parallel database middleware, an inexpensive solution based on Service-Oriented Architecture, gathers standalone database instances to provide users with a highly scalable relational data management platform. However, with the advent of the era of large-scale data, such platforms face a serious challenge in text data retrieval. Motivated by this observation, a parallel database middleware based on semi-structured data is first designed to support text retrieval. Then, a two-level inverted-index scheme called TIIS is designed for full-text queries. The advantages of TIIS are that it can quickly locate result data in a large distributed database cluster storing large-scale data, and that it can greatly reduce network I/O and disk I/O. Experimental results show that, compared with Hive using the Hadoop Distributed File System in the same hardware environment, when our system performs typical TPC-H data analysis, the cost of full-text queries is reduced by 90% on average on 2 GB of commercial data.
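
The abstract does not describe the internal layout of TIIS, so the following is only a generic sketch of how a two-level inverted index can cut network and disk I/O, not the authors' implementation. It assumes a first level that maps each term to the database nodes holding it and a second level, local to each node, that maps each term to row IDs. The class names `NodeIndex` and `TwoLevelIndex`, the whitespace tokenizer, and the modulo placement rule are all hypothetical.

```python
from collections import defaultdict


class NodeIndex:
    """Second level (hypothetical): per-node inverted index, term -> row ids."""

    def __init__(self):
        self.rows = {}
        self.postings = defaultdict(set)

    def add(self, row_id, text):
        self.rows[row_id] = text
        for term in text.lower().split():  # naive whitespace tokenizer (assumption)
            self.postings[term].add(row_id)

    def search(self, term):
        # Only the matching posting list is read, not the whole table.
        return sorted(self.postings.get(term.lower(), ()))


class TwoLevelIndex:
    """First level (hypothetical): global index, term -> ids of nodes holding it."""

    def __init__(self, node_count):
        self.nodes = [NodeIndex() for _ in range(node_count)]
        self.term_to_nodes = defaultdict(set)

    def add(self, row_id, text):
        node_id = row_id % len(self.nodes)  # toy placement rule (assumption)
        self.nodes[node_id].add(row_id, text)
        for term in text.lower().split():
            self.term_to_nodes[term].add(node_id)

    def search(self, term):
        # Contact only the nodes that the first level says hold the term.
        contacted = sorted(self.term_to_nodes.get(term.lower(), ()))
        hits = []
        for node_id in contacted:
            hits.extend(self.nodes[node_id].search(term))
        return sorted(hits), contacted


index = TwoLevelIndex(node_count=4)
index.add(0, "parallel database middleware")
index.add(1, "inverted index for text retrieval")
index.add(5, "full text query over an inverted index")
index.add(6, "relational data management")

hits, contacted = index.search("inverted")
print(hits, contacted)
```

In this toy setup the query touches a single node out of four, which illustrates the general idea behind skipping nodes (less network I/O) and skipping full scans on each node (less disk I/O).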
