Conference: International World Wide Web Conference

Experiments with Persian Text Compression for Web

Abstract

The growing importance of Unicode for text encoding implies a possible doubling of data storage space and data transmission time, and a corresponding need for data compression. The approach presented in this paper aims to reduce the storage space and transmission time of Persian text files in web-based applications and on the Internet. The basic idea is to find the most frequently repeated n-grams in Persian text and replace each of them with a single character from the user-defined sections of Unicode. Compression is performed once, on the server side, and no separate decompression step is required: the browser's rendering process does the decompression. No additional program or add-in for decompression needs to be installed on the browser or client side; the user only needs to download the appropriate Unicode font once. A genetic algorithm is used to select the most appropriate n-grams. In the best case, we achieved a 52.26% reduction in file size. The method is general and applies equally well to English and other languages.
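The n-gram substitution idea described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: the function names (`top_ngrams`, `build_table`, `compress`, `decompress`) are hypothetical, the n-grams are chosen by a simple frequency-based ranking rather than the genetic algorithm the authors use, and the substitute characters are taken from the Unicode Private Use Area starting at U+E000. The `decompress` function exists only to check the round trip; in the approach described above, a font that draws each substitute character as its n-gram would play that role in the browser.

```python
from collections import Counter

PUA_START = 0xE000  # first code point of the BMP Private Use Area
PUA_END = 0xF8FF    # last code point of the BMP Private Use Area


def top_ngrams(text, n_values=(2, 3, 4), k=16):
    """Pick k n-grams by estimated saving (a stand-in for the paper's genetic algorithm)."""
    counts = Counter()
    for n in n_values:
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    # Replacing an n-gram of length n seen c times saves roughly c * (n - 1) characters.
    ranked = sorted(counts, key=lambda g: (-counts[g] * (len(g) - 1), g))
    return ranked[:k]


def build_table(ngrams):
    """Map each n-gram to one private use character."""
    if len(ngrams) > PUA_END - PUA_START + 1:
        raise ValueError("too many n-grams for the private use range")
    return {gram: chr(PUA_START + i) for i, gram in enumerate(ngrams)}


def compress(text, table):
    """Greedy left-to-right replacement, trying the longest n-gram first."""
    if any(PUA_START <= ord(ch) <= PUA_END for ch in text):
        raise ValueError("input already contains private use characters")
    max_len = max(map(len, table), default=0)
    out = []
    i = 0
    while i < len(text):
        for n in range(max_len, 1, -1):
            chunk = text[i:i + n]
            if chunk in table:
                out.append(table[chunk])
                i += len(chunk)
                break
        else:
            # No n-gram from the table starts here; copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def decompress(data, table):
    """Expand substitute characters back to n-grams (for verification only)."""
    reverse = {code: gram for gram, code in table.items()}
    return "".join(reverse.get(ch, ch) for ch in data)


if __name__ == "__main__":
    sample = "این یک متن است و این یک متن دیگر است"
    table = build_table(top_ngrams(sample, k=8))
    packed = compress(sample, table)
    assert decompress(packed, table) == sample
    # Sizes in UTF-16, where Persian letters and private use characters both take 2 bytes.
    print(len(sample.encode("utf-16-le")), len(packed.encode("utf-16-le")))
```

The achievable saving depends on the encoding. In UTF-16, every substitution of an n-gram of length n removes n - 1 two-byte units. In UTF-8, a private use character in this range takes three bytes while a Persian letter takes two, so a bigram replacement saves only one byte.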
