
Word-based block-sorting text compression


Abstract

Block-sorting is an innovative compression mechanism introduced in 1994 by Burrows and Wheeler. It involves three steps: permuting the input one block at a time through the use of the Burrows-Wheeler Transform (BWT); applying a Move-To-Front (MTF) transform to each of the permuted blocks; and then entropy coding the output with a Huffman or arithmetic coder. Until now, block-sorting implementations have assumed that the input message is a sequence of characters. In this paper we extend the block-sorting mechanism to word-based models. We also consider other transformations as an alternative to MTF, and are able to show improved compression results compared to MTF. For large files of text, the combination of word-based modelling, BWT, and MTF-like transformations allows excellent compression effectiveness to be attained within reasonable resource costs.
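The first two steps of the pipeline described above can be sketched over word tokens rather than characters. This is a minimal illustration only, not the paper's implementation: the tokenizer (alternating runs of word and non-word characters), the first-occurrence vocabulary numbering, the naive rotation sort, and all function names are assumptions made for the example. The final entropy-coding step and the paper's alternatives to MTF are omitted.

```python
import re


def tokenize(text):
    """Split text into alternating runs of word and non-word characters."""
    return re.findall(r"\w+|\W+", text)


def to_ids(tokens):
    """Map each distinct token to an integer, numbered by first occurrence."""
    vocab = {}
    ids = [vocab.setdefault(tok, len(vocab)) for tok in tokens]
    return ids, list(vocab)


def bwt(symbols):
    """Naive BWT: sort all rotations, return (last column, row of the original)."""
    n = len(symbols)
    if n == 0:
        return [], 0
    rows = sorted(range(n), key=lambda i: symbols[i:] + symbols[:i])
    last = [symbols[(i - 1) % n] for i in rows]
    return last, rows.index(0)


def inverse_bwt(last, primary):
    """Invert the BWT using a stable sort of the last column."""
    order = sorted(range(len(last)), key=lambda i: last[i])
    out, row = [], primary
    for _ in range(len(last)):
        row = order[row]
        out.append(last[row])
    return out


def mtf(symbols, alphabet_size):
    """Move-To-Front: emit each symbol's current rank, then move it to the front."""
    table = list(range(alphabet_size))
    ranks = []
    for s in symbols:
        r = table.index(s)
        ranks.append(r)
        table.insert(0, table.pop(r))
    return ranks


def inverse_mtf(ranks, alphabet_size):
    """Invert Move-To-Front by replaying the same table updates."""
    table = list(range(alphabet_size))
    out = []
    for r in ranks:
        s = table.pop(r)
        out.append(s)
        table.insert(0, s)
    return out


def encode(text):
    """Word tokens -> ids -> BWT -> MTF ranks (entropy coding not shown)."""
    ids, vocab = to_ids(tokenize(text))
    last, primary = bwt(ids)
    return mtf(last, len(vocab)), primary, vocab


def decode(ranks, primary, vocab):
    """Reverse of encode: MTF ranks -> BWT column -> ids -> text."""
    last = inverse_mtf(ranks, len(vocab))
    return "".join(vocab[i] for i in inverse_bwt(last, primary))
```

Calling `decode(*encode(text))` should return the original text. The rotation sort here is quadratic or worse in the number of tokens and is only suitable for short inputs; a practical block sorter would use a suffix-array construction and would also need to transmit the vocabulary.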
