首页> 外文学位 >Database Streaming Compression on Memory-Limited Machines
【24h】

Database Streaming Compression on Memory-Limited Machines

机译:内存受限机器上的数据库流压缩

获取原文
获取原文并翻译 | 示例

摘要

Dynamic Huffman compression algorithms operate on data-streams with a bounded symbol list. With these algorithms, the complete list of symbols must be contained in main memory or secondary storage. A horizontal format transaction database that is streaming can have a very large item list. Many nodes tax both the processing hardware primary memory size, and the processing time to dynamically maintain the tree.;This research investigated Huffman compression of a transaction-streaming database with a very large symbol list, where each item in the transaction database schema's item list is a symbol to compress. The constraint of a large symbol list is, in this research, equivalent to the constraint of a memory-limited machine. A large symbol set will result if each item in a large database item list is a symbol to compress in a database stream. In addition, database streams may have some temporal component spanning months or years. Finally, the horizontal format is the format most suited to a streaming transaction database because the transaction IDs are not known beforehand. This research prototypes an algorithm that will compresses a transaction database stream.;There are several advantages to the memory limited dynamic Huffman algorithm. Dynamic Huffman algorithms are single pass algorithms. In many instances a second pass over the data is not possible, such as with streaming databases. Previous dynamic Huffman algorithms are not memory limited, they are asymptotic to O(n), where n is the number of distinct item IDs. Memory is required to grow to fit the n items. The improvement of the new memory limited Dynamic Huffman algorithm is that it would have an O(k) asymptotic memory requirement; where k is the maximum number of nodes in the Huffman tree, k < n, and k is a user chosen constant. The new memory limited Dynamic Huffman algorithm compresses horizontally encoded transaction databases that do not contain long runs of 0's or 1's.
机译:动态霍夫曼压缩算法在带符号列表的数据流上运行。使用这些算法,符号的完整列表必须包含在主存储器或辅助存储器中。流式传输的水平格式交易数据库可能具有非常大的项目列表。许多节点都在负担处理硬件主内存大小和动态维护树的处理时间上的负担;这项研究调查了具有非常大的符号列表的事务流数据库的霍夫曼压缩,其中事务数据库模式的项列表中的每个项是要压缩的符号。在本研究中,大符号列表的约束等效于内存受限机器的约束。如果大型数据库项目列表中的每个项目都是要在数据库流中压缩的符号,则将产生大型符号集。此外,数据库流可能具有跨越数月或数年的时间分量。最后,水平格式是最适合流事务数据库的格式,因为事务ID事先未知。这项研究为将压缩事务数据库流的算法原型化了。内存受限动态霍夫曼算法有很多优点。动态霍夫曼算法是单遍算法。在许多情况下,例如通过流数据库,第二次传递数据是不可能的。先前的动态霍夫曼算法不受内存限制,它们是O(n)渐近的,其中n是不同项目ID的数量。需要内存才能增长以适应n个项目。新的内存受限动态霍夫曼算法的改进之处在于它将具有O(k)渐近内存需求;其中k是霍夫曼树中最大节点数,k

著录项

  • 作者

    Bruccoleri, Damon.;

  • 作者单位

    Nova Southeastern University.;

  • 授予单位 Nova Southeastern University.;
  • 学科 Computer science.
  • 学位 Ph.D.
  • 年度 2018
  • 页码 208 p.
  • 总页数 208
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类
  • 关键词

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号