Conference: 2013 International Conference on Computing, Electrical and Electronics Engineering

ETAOSD: Static dictionary-based transformation method for text compression


Abstract

This paper presents a new static dictionary-based algorithm for text transformation that increases the compression ratio achieved by standard compression tools. The basic idea is to define a pattern for each word in a static dictionary by replacing all or most of the characters of the word with the character that is most frequently used in text files. The proposed algorithm transforms any text file into another, encoded file that is almost the same size as the original but has different statistical properties. The transformation method was designed, implemented, and tested on the Gutenberg Corpus. In general, the transformed output improved the results of common standard compression tools, such as arithmetic coding, Huffman coding, Bzip2, Gzip, and WinZip, to varying degrees. The compression performance of all of these tools improved, especially when the patterns of the transformed words were passed through a costless run-length encoding (RLE) algorithm. With Bzip2, the resulting output files reached a compression ratio of about 76.75% with an average code length of 1.88. The final result is very promising, and it could be improved further by applying a dynamic dictionary-based text transformation technique.
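The abstract does not give the pattern scheme in detail, so the following is only a minimal sketch of the general idea, not the authors' ETAOSD implementation. It assumes that the filler is the letter `e`, that each dictionary word maps to a pattern of the same length made of filler characters plus a short distinguishing suffix, and that only lowercase words are looked up. The names `FILLER`, `build_patterns`, `transform`, and `restore` are hypothetical.

```python
import re
from collections import defaultdict

FILLER = "e"  # assumption: stand-in for the most frequently used character
ALPHABET = [c for c in "abcdefghijklmnopqrstuvwxyz" if c != FILLER]
TOKEN = re.compile(r"[A-Za-z]+|[^A-Za-z]+")  # runs of letters, runs of anything else


def _suffix(index):
    """Return a filler-free code for index >= 1, or "" for index 0."""
    out = ""
    while index > 0:
        index, rem = divmod(index - 1, len(ALPHABET))
        out = ALPHABET[rem] + out
    return out


def build_patterns(words):
    """Map each dictionary word to a same-length pattern made mostly of FILLER."""
    by_length = defaultdict(list)
    for word in dict.fromkeys(words):  # drop duplicates, keep order
        by_length[len(word)].append(word)
    patterns = {}
    for length, group in by_length.items():
        for index, word in enumerate(group):
            suffix = _suffix(index)
            if len(suffix) > length:
                raise ValueError(f"too many words of length {length}")
            patterns[word] = FILLER * (length - len(suffix)) + suffix
    return patterns


def transform(text, patterns):
    """Replace dictionary words by their patterns; other tokens pass through."""
    reserved = set(patterns.values())
    out = []
    for token in TOKEN.findall(text):
        if token in patterns:
            out.append(patterns[token])
        elif token in reserved:
            # A real scheme would need an escape mechanism here.
            raise ValueError(f"token {token!r} collides with a pattern")
        else:
            out.append(token)
    return "".join(out)


def restore(text, patterns):
    """Invert transform() using the same static dictionary."""
    words = {pattern: word for word, pattern in patterns.items()}
    return "".join(words.get(token, token) for token in TOKEN.findall(text))


if __name__ == "__main__":
    patterns = build_patterns(["the", "and", "for", "of", "to", "in", "text", "file"])
    sample = "the size of the text file and the dictionary"
    encoded = transform(sample, patterns)
    print(encoded)  # eee size ee eee eeee eeea eea eee dictionary
    print(restore(encoded, patterns) == sample)  # True
```

The encoded text has the same length as the input, but the filler character occurs far more often and in longer runs, which is the kind of change in statistical properties that a later RLE or general-purpose compression stage could exploit. Whether this helps in practice depends on the dictionary and the compressor; the sketch makes no claim about the figures reported in the abstract.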
