Indexing HDFS Data in PDW: Splitting the data from the index


Abstract

There is a growing interest in making relational DBMSs work synergistically with MapReduce systems. However, there are interesting technical challenges associated with figuring out the right balance between the use and co-deployment of these systems. This paper focuses on one specific aspect of this balance, namely how to leverage the superior indexing and query processing power of a relational DBMS for data that is often more cost-effectively stored in Hadoop/HDFS. We present a method to use conventional B+-tree indices in an RDBMS for data stored in HDFS and demonstrate that our approach is especially effective for highly selective queries.
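The idea of keeping an index apart from the data it points to can be illustrated with a small sketch. This is not the paper's implementation: a sorted in-memory list searched with `bisect` stands in for a B+-tree, a local file stands in for an HDFS file, and the class name `SplitIndex` and its parameters are hypothetical. The index stores only (key, byte offset, record length), so a highly selective lookup seeks directly to the few matching records instead of scanning the whole file.

```python
import bisect


class SplitIndex:
    """Illustrative index kept separately from a delimited data file.

    Assumption: a sorted key list plus bisect stands in for a B+-tree,
    and a local file stands in for a file stored in HDFS.
    """

    def __init__(self, data_path, key_column=0, sep=b","):
        self.data_path = data_path
        entries = []
        offset = 0
        # One full scan builds the index: key -> (byte offset, record length).
        with open(data_path, "rb") as f:
            for line in f:
                key = line.rstrip(b"\r\n").split(sep)[key_column]
                entries.append((key, offset, len(line)))
                offset += len(line)
        entries.sort()
        self._keys = [e[0] for e in entries]
        self._locs = [(e[1], e[2]) for e in entries]

    def lookup(self, key):
        """Return every record whose key equals `key`, reading only those byte ranges."""
        lo = bisect.bisect_left(self._keys, key)
        hi = bisect.bisect_right(self._keys, key)
        records = []
        with open(self.data_path, "rb") as f:
            for offset, length in self._locs[lo:hi]:
                f.seek(offset)
                records.append(f.read(length).rstrip(b"\r\n"))
        return records
```

With this layout the cost of a lookup grows with the number of matching records rather than with the file size, which is why such a scheme would be expected to favor highly selective queries; a query that matches most of the file gains little over a plain scan.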


Bibliographic record

Source
International Conference on Very Large Data Bases | 2014 | pp. 1520-1528 | 9 pages
Authors
Vinitha Reddy Gankidi; Nikhil Teletia; Jignesh M. Patel; Alan Halverson; David J. DeWitt
Original format: PDF

