The entropy of patterns of sequences generated by independent and identically distributed (i.i.d.) sources with unknown large, possibly infinite, alphabets is investigated. A pattern is a sequence of indices that contains all consecutive integer indices in increasing order of first occurrence. If the alphabet of a source that generated a sequence is unknown, the inevitable cost of coding the unknown alphabet symbols can be exploited to create the pattern of the sequence. This pattern can in turn be compressed by itself. We extend our previous upper bounds on the entropy of patterns generated by a bounded alphabet to unbounded, possibly infinite, alphabets. Unlike the bounded case, we now allow alphabets with symbols that occur with both high and very low probabilities. We study the effect of all very low probability letters on the pattern entropy. All the low probability letters are collapsed into one symbol. Beyond the contribution of that symbol to the entropy, and unlike i.i.d. sequences, the additional contribution to the entropy of patterns of length n of all letters with probability 1/n^{1+ε} or smaller, for some arbitrarily small ε, is shown to be o(n) over the whole sequence (and o(1) per symbol). The same contribution of all letters with probability 1/n^{2+ε} or smaller is shown to be o(1) for the whole sequence. If an i.i.d. source with an infinite alphabet has only letters with probability 1/n^{2+ε} or smaller, the entropy of its patterns approaches zero, i.e., the only likely pattern is the pattern 123...n. This is in contrast to the i.i.d. entropy, which is super-linear in n. The results are derived through the design of a low-complexity sequential coding method for patterns that achieves the upper bound.
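The pattern of a sequence, as defined above, can be computed with a minimal sketch; the function name and the 1-based indexing convention are illustrative assumptions, not taken from the paper:

```python
def pattern(seq):
    """Replace each symbol by the order of its first occurrence (1-based)."""
    first_index = {}  # symbol -> index of first occurrence
    out = []
    for s in seq:
        if s not in first_index:
            # a new symbol gets the next consecutive integer index
            first_index[s] = len(first_index) + 1
        out.append(first_index[s])
    return out

# For example, the pattern of "abracadabra":
print(pattern("abracadabra"))  # [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]
```

Note that the pattern is independent of the actual alphabet: "abracadabra" and any relabeling of its letters produce the same index sequence, which is why the alphabet-coding cost can be separated from the pattern itself.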