First EurAsian Conference on EurAsia-ICT 2002: Information and Communication Technology, Oct 29-31, 2002, Shiraz, Iran
Storage and Querying of High Dimensional Sparsely Populated Data in Compressed Representation

Abstract

Storing and querying high-dimensional, sparsely populated data poses a new challenge for the conventional horizontal model: it requires support for a large number of columns and frequent alteration of the database schema. The sparsity of the data degrades performance in both time and space. A 3-ary vertical representation can be used instead, but the cardinality of the vertical table grows exponentially as the density of non-null values increases, and it is difficult to support multiple data types in a single vertical table. In this paper, we present a compressed 1-ary vertical representation in which schema evolution is easy and size grows linearly with non-null density. Queries can be processed on the compressed form of the data without decompression; decompression is performed only when the result is needed. We consider three alternative representations: 3-ary uncompressed vertical, 1-ary compressed bit-array, and 1-ary compressed offset. Experimental results show the superiority of the 1-ary offset representation in both space and time.
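
The abstract names the three representations but does not describe their physical layouts, so the following is only a minimal sketch under assumptions, not the paper's implementation. It assumes the 3-ary form stores one (object id, attribute, value) triple per non-null cell, and that each 1-ary column stores its packed non-null values alongside either a presence bit per row (bit-array) or a list of row positions (offset). All function names and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Optional[Any]]  # horizontal row; None stands for a null cell


def to_vertical_triples(rows: List[Row]) -> List[Tuple[int, str, Any]]:
    """Assumed 3-ary form: one (oid, attribute, value) triple per non-null cell."""
    return [
        (oid, attr, val)
        for oid, row in enumerate(rows)
        for attr, val in row.items()
        if val is not None
    ]


def to_bit_array_column(rows: List[Row], attr: str) -> Tuple[List[int], List[Any]]:
    """Assumed 1-ary bit-array form: a presence bit per row plus packed values."""
    bits = [1 if row.get(attr) is not None else 0 for row in rows]
    values = [row[attr] for row in rows if row.get(attr) is not None]
    return bits, values


def to_offset_column(rows: List[Row], attr: str) -> Tuple[List[int], List[Any]]:
    """Assumed 1-ary offset form: row positions of non-null cells plus packed values."""
    offsets = [oid for oid, row in enumerate(rows) if row.get(attr) is not None]
    values = [rows[oid][attr] for oid in offsets]
    return offsets, values


def select_equal(offsets: List[int], values: List[Any], target: Any) -> List[int]:
    """Equality selection that scans only the packed values, never rebuilding rows."""
    return [oid for oid, val in zip(offsets, values) if val == target]


# Hypothetical sparse table: 4 rows, 3 attributes, 4 non-null cells.
rows: List[Row] = [
    {"a": 5, "b": None, "c": None},
    {"a": None, "b": None, "c": "x"},
    {"a": 5, "b": 2.5, "c": None},
    {"a": None, "b": None, "c": None},
]

triples = to_vertical_triples(rows)
bits_a, packed_a = to_bit_array_column(rows, "a")
offsets_a, values_a = to_offset_column(rows, "a")
matches = select_equal(offsets_a, values_a, 5)
```

In this sketch the offset column holds one position and one value per non-null cell, so its size tracks the number of non-null values, and the selection runs directly on the packed column and returns row ids without reconstructing any horizontal row.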