IEEE Transactions on Information Theory

Large Alphabet Source Coding Using Independent Component Analysis

Abstract

Large alphabet source coding is a basic and well-studied problem in data compression. It has many applications, such as compression of natural language text, speech, and images. The classic perception underlying the most commonly used methods is that a source is best described over an alphabet that is at least as large as the observed alphabet. In this paper, we challenge this approach and introduce a conceptual framework in which a large alphabet source is decomposed into “as statistically independent as possible” components. This decomposition allows us to apply entropy encoding to each component separately, while benefiting from their reduced alphabet size. We show that in many cases, such a decomposition results in a sum of marginal entropies that is only slightly greater than the entropy of the source. Our suggested algorithm, based on a generalization of binary independent component analysis, is applicable to a variety of large alphabet source coding setups. These include classical lossless compression, universal compression, and high-dimensional vector quantization. In each of these setups, our suggested approach outperforms the most commonly used methods. Moreover, our proposed framework is significantly easier to implement in most of these cases.
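To make the decomposition idea concrete, here is a minimal Python sketch. It is not the paper's generalized binary ICA algorithm: it simply draws samples from a toy 256-symbol source, splits each symbol into its bit planes as a naive stand-in for a decomposition into near-independent binary components, and compares the sum of the components' marginal entropies with the joint entropy of the source. The distribution, sample size, and all names are illustrative assumptions.

```python
import numpy as np
from collections import Counter

def entropy(samples):
    """Empirical entropy (bits per symbol) of a sequence of hashable symbols."""
    counts = np.array(list(Counter(samples).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Toy large-alphabet source: 256 symbols drawn from a skewed distribution.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.full(256, 0.1))
symbols = rng.choice(256, size=20000, p=probs)

# Naive decomposition: split each symbol into its 8 binary components (bit planes).
# The paper's framework instead searches for an invertible transform that makes
# the components "as statistically independent as possible" before coding them.
bits = [(symbols >> k) & 1 for k in range(8)]

joint = entropy(symbols.tolist())
marginal_sum = sum(entropy(b.tolist()) for b in bits)

print(f"joint entropy        : {joint:.3f} bits/symbol")
print(f"sum of bit marginals : {marginal_sum:.3f} bits/symbol")
# The gap (marginal_sum - joint) is the redundancy paid for encoding the
# components separately; the framework's goal is to make this gap small
# while each component is coded over a much smaller alphabet.
```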
