Source: LIPIcs: Leibniz International Proceedings in Informatics

Using Statistical Encoding to Achieve Tree Succinctness Never Seen Before


Abstract

We propose new entropy measures for trees. The known ones are $H_k(\mathcal{T})$, the $k$-th order (tree label) entropy (Ferragina et al. 2005), and the tree entropy $H(\mathcal{T})$ (Jansson et al. 2006); the former considers only the tree labels and the latter only the tree shape. The proposed entropy measures, $H_k(\mathcal{T} \mid L)$ and $H_k(L \mid \mathcal{T})$, exploit the relation between the labels and the tree shape. We prove that they lower bound tree entropy and label entropy, respectively, i.e. $H_k(\mathcal{T} \mid L) \le H(\mathcal{T})$ and $H_k(L \mid \mathcal{T}) \le H_k(L)$. Besides being theoretically superior, the new measures are significantly smaller in practice. We also propose a new succinct representation of labeled trees, which represents a tree $T$ within one of the following bounds: $|T|\,(H(\mathcal{T}) + H_k(L \mid \mathcal{T}))$ or $|T|\,(H_k(\mathcal{T} \mid L) + H_k(L))$ bits. The representation is based on a new, simple method of partitioning the tree, which preserves both tree shape and node degrees. The previous state-of-the-art method of compressing the tree achieved $|T|\,(H(\mathcal{T}) + H_k(L))$ bits by combining the results of Ferragina et al. 2005 and Jansson et al. 2006, so the proposed representation is never worse and often superior. Moreover, our representation supports standard tree navigation in constant time, as well as more complex queries. No structure achieving these space bounds was known before: the aforementioned solution worked for compression alone, whereas our structure is the first that achieves $H_k(\mathcal{T})$ for $k > 0$ and supports such queries. Lastly, our data structure is fairly simple, both conceptually and in terms of implementation, and it uses known tools, which is a counter-argument to the claim that methods based on tree partitioning are impractical.
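The two inequalities in the abstract can be checked numerically on a toy tree. The sketch below is only an illustration under assumed definitions, which the abstract itself does not spell out: tree entropy is taken as the zeroth-order empirical entropy of node degrees, the $k$-th order label context is taken as the labels of the $k$ nearest ancestors, and the conditional measures add the node's own label (for $H_k(\mathcal{T} \mid L)$) or its degree (for $H_k(L \mid \mathcal{T})$) to that context. The `Node` class, the helper names, and the example tree are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


class Node:
    """A node of an ordered labeled tree (hypothetical helper type)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def empirical_entropy(counts):
    """Zeroth-order empirical entropy, in bits per symbol, of a Counter."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def conditional_entropy(pairs):
    """Per-node entropy of symbols grouped by context: (1/n) * sum_c n_c * H_0(c)."""
    by_context = defaultdict(Counter)
    for context, symbol in pairs:
        by_context[context][symbol] += 1
    n = sum(sum(c.values()) for c in by_context.values())
    return sum(sum(c.values()) * empirical_entropy(c) for c in by_context.values()) / n


def walk(root, k):
    """Yield (labels of the k nearest ancestors, label, degree) for every node."""
    stack = [(root, ())]
    while stack:
        node, ancestors = stack.pop()
        yield ancestors, node.label, len(node.children)
        # Guard k == 0 explicitly, because a slice [-0:] would keep everything.
        child_ancestors = (ancestors + (node.label,))[-k:] if k > 0 else ()
        for child in node.children:
            stack.append((child, child_ancestors))


def entropies(root, k):
    """Return the four per-node measures under the assumed definitions."""
    nodes = list(walk(root, k))
    return {
        # Degrees with an empty context: assumed form of tree entropy.
        "H(T)": conditional_entropy(((), deg) for _, _, deg in nodes),
        # Labels in the context of ancestor labels: assumed label entropy.
        "H_k(L)": conditional_entropy((anc, lab) for anc, lab, _ in nodes),
        # Degrees given ancestor labels and the node's own label.
        "H_k(T|L)": conditional_entropy(((anc, lab), deg) for anc, lab, deg in nodes),
        # Labels given ancestor labels and the node's degree.
        "H_k(L|T)": conditional_entropy(((anc, deg), lab) for anc, lab, deg in nodes),
    }


# Toy tree in which a node's label determines its degree.
tree = Node("a", [
    Node("b", [Node("c"), Node("c")]),
    Node("b", [Node("c"), Node("c")]),
    Node("d"),
])

result = entropies(tree, k=1)
for name, value in result.items():
    print(f"{name}: {value:.4f} bits per node")
```

Under these assumed definitions, adding the label or the degree to the context only refines the grouping of nodes, so the conditional values cannot exceed the unconditional ones. In this toy tree each label fixes the degree, so the conditional shape measure drops to zero.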
