METHOD FOR ESTABLISHING INDEX ON HDFS-BASED SPARK-SQL BIG-DATA PROCESSING SYSTEM

Abstract

Provided is a method for establishing an index on an HDFS-based Spark-SQL big-data processing system. Using SQL statements, indexes can be added to and deleted from the system, and data can be inserted into and deleted from it. When data is queried, the system automatically determines whether the query column has an index; if it does, only the file blocks listed in the index are searched, and file blocks that do not need to be searched are filtered out. Adding index functionality to Spark-SQL can effectively increase query speed. For example, consider a typical Spark-SQL data table of 1000 GB stored as 1000 files of 1 GB each. Querying an individual record would originally require scanning all 1000 files; once the index is established, scanning one file suffices, so efficiency increases by a factor of 1000. Under typical circumstances, and judging by experience with conventional relational databases, queries against a Spark-SQL table with an established index run 100 to 10,000 times faster, or more, than the same queries without an index.
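The file-pruning idea in the abstract can be illustrated with a small in-memory sketch. This is not the patented implementation and does not use Spark or HDFS: the class `FileBlockIndex`, the function `query`, the `part-NNNN` file names, and the `id` column are all hypothetical, and each "file" holds three rows rather than 1 GB of data. It only shows how a value-to-file mapping lets a query skip files that cannot contain the requested record.

```python
from collections import defaultdict


class FileBlockIndex:
    """Toy index (hypothetical): maps each value of one column to the files holding it."""

    def __init__(self, column):
        self.column = column
        self.value_to_files = defaultdict(set)

    def add_file(self, file_name, rows):
        # Record, for every row, which file its column value lives in.
        for row in rows:
            self.value_to_files[row[self.column]].add(file_name)

    def candidate_files(self, value):
        # Files that may contain the value; empty set if the value was never indexed.
        return set(self.value_to_files.get(value, ()))


def query(files, column, value, index=None):
    """Return (matching rows, number of files scanned)."""
    if index is not None and index.column == column:
        # The query column is indexed: scan only the files the index lists.
        to_scan = index.candidate_files(value)
    else:
        # No usable index: every file has to be scanned.
        to_scan = set(files)
    matches = [
        row
        for name in sorted(to_scan)
        for row in files[name]
        if row[column] == value
    ]
    return matches, len(to_scan)


# 1000 small "files" standing in for the 1000 x 1 GB files of the abstract.
files = {
    f"part-{i:04d}": [
        {"id": i * 3 + j, "payload": f"row-{i * 3 + j}"} for j in range(3)
    ]
    for i in range(1000)
}

index = FileBlockIndex("id")
for name, rows in files.items():
    index.add_file(name, rows)

rows_full, scanned_full = query(files, "id", 1234)         # no index
rows_idx, scanned_idx = query(files, "id", 1234, index)    # with index

print(scanned_full, scanned_idx)  # files scanned without and with the index
```

In this toy setup the unindexed query touches all 1000 files while the indexed query touches one, which mirrors the 1000-fold reduction described above. Real speedups would depend on index lookup cost, value distribution, and how many files actually contain matching rows.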

Bibliographic details

Publication number: WO2017096939A1
Publication date: 2017-06-15
Original format: PDF
Applicant/patentee: CHINA COMMUNICATION SOFTWARE TECHNOLOGY CO. LTD.; CHINA COMMUNICATION TECHNOLOGY CO. LTD.
Application number: WO2016CN94925
Inventors: ZHANG YUN; FENG JUN
Filing date: 2016-08-12
Classification: G06F17/30
Country: WO
Indexed on: 2022-08-21 13:30:50
