Extremal Entropy: Information Geometry, Numerical Entropy Mapping, and Machine Learning Application of Associated Conditional Independences.

Abstract

Entropy and conditional mutual information are the key quantities information theory provides to measure the uncertainty of, and the independence relations between, random variables. While these measures are central to areas as diverse as physics, communication, signal processing, and machine learning, surprisingly much about them remains unknown. This thesis explores some of this unknown territory, ranging from fundamental questions about the interdependence between the entropies of different subsets of random variables, addressed via the characterization of the region of entropic vectors, to applied questions about how conditional independences can be harnessed to improve the efficiency of supervised learning on discrete-valued datasets.

The region of entropic vectors is a convex cone that has been shown to be at the core of many fundamental limits for problems in multiterminal data compression, network coding, and multimedia transmission. This cone is known to be non-polyhedral for four or more random variables, yet its boundary remains unknown for four or more discrete random variables. We prove that only one form of nonlinear non-Shannon inequality is necessary to fully characterize the region for four random variables, and we identify this inequality in terms of a function that is the solution to an optimization problem. We also establish symmetry and convexity properties of this function that rely on the structure of the region of entropic vectors and on the Ingleton inequalities. Methods for specifying probability distributions that lie in faces and on the boundary of the convex cone are derived, and then used to map optimized inner bounds onto the unknown part of the entropy region. The first method uses tools and algorithms from abstract algebra to efficiently determine those supports of the joint probability mass function of four or more random variables that can, for some appropriate set of non-zero probabilities, yield entropic vectors in the gap between the best known inner and outer bounds. These supports are then used, together with numerical optimization over the non-zero probabilities, to provide inner bounds on the unknown part of the entropy region. Next, information geometry is used to parameterize and study the structure of the probability distributions on these supports that yield entropic vectors in the faces of the entropy region and in its unknown part.

In the final section of the thesis, we propose score functions based on entropy and conditional mutual information as components of partition strategies for supervised learning on datasets with discrete-valued features. Partitioning the data reduces the complexity of training and testing on large datasets. We demonstrate that such partition strategies can also be efficient in the sense that, when the training and testing datasets are split according to them and the blocks of the partition are processed separately, the classification performance is comparable to, or better than, the performance when the data are not partitioned at all.
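For context, the abstract leaves the Ingleton inequality unstated. Its standard four-variable form from the literature (not a statement reproduced from the thesis) is, for random variables A, B, C, D:

```latex
% Ingleton inequality for four random variables A, B, C, D, where
%   I(X;Y)   = H(X) + H(Y) - H(X,Y)
%   I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
I(A;B) \le I(A;B \mid C) + I(A;B \mid D) + I(C;D)
```

Entropic vectors arising from linear (and, more generally, abelian group) structure satisfy this inequality, while some entropic vectors violate it; the violating distributions populate the gap between the inner and outer bounds discussed above.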
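To make the central object concrete: for four discrete random variables, the entropic vector collects the 2^4 - 1 = 15 subset entropies. Below is a minimal sketch, not code from the thesis (the function names are illustrative), that computes this vector from a joint pmf and evaluates one instance of the Ingleton expression above:

```python
import itertools
import numpy as np

def entropic_vector(p):
    """Entropic vector of a joint pmf over 4 discrete variables.

    p: array of shape (n1, n2, n3, n4), non-negative, summing to 1.
    Returns a dict mapping each non-empty subset of {0,1,2,3} to its
    joint entropy in bits.
    """
    h = {}
    for r in range(1, 5):
        for subset in itertools.combinations(range(4), r):
            # Marginalize out the variables not in the subset.
            axes = tuple(i for i in range(4) if i not in subset)
            marg = p.sum(axis=axes).ravel()
            marg = marg[marg > 0]          # convention: 0 log 0 = 0
            h[subset] = float(-(marg * np.log2(marg)).sum())
    return h

def ingleton(h):
    """Ingleton expression I(A;B|C) + I(A;B|D) + I(C;D) - I(A;B)
    for A,B,C,D = variables 0,1,2,3; non-negative exactly when this
    instance of the Ingleton inequality holds."""
    H = lambda *s: h[tuple(sorted(s))]
    i_ab   = H(0) + H(1) - H(0, 1)
    i_cd   = H(2) + H(3) - H(2, 3)
    i_ab_c = H(0, 2) + H(1, 2) - H(0, 1, 2) - H(2)
    i_ab_d = H(0, 3) + H(1, 3) - H(0, 1, 3) - H(3)
    return i_ab_c + i_ab_d + i_cd - i_ab

# Example: four i.i.d. fair bits; the Ingleton expression is 0 here.
p = np.full((2, 2, 2, 2), 1 / 16)
h = entropic_vector(p)
print(h[(0, 1, 2, 3)])   # 4.0 bits
print(ingleton(h))       # 0.0
```

Numerically mapping the entropy region then amounts to optimizing such quantities over the non-zero probabilities on carefully chosen supports, in the spirit the abstract describes.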
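For the final section's theme, the abstract does not give the score functions themselves; the sketch below only illustrates the general idea of scoring discrete features by empirical (conditional) mutual information with the label and splitting the dataset into blocks accordingly. All names and the single-feature selection rule are illustrative assumptions, not the thesis's method.

```python
import numpy as np
from collections import Counter

def H(*cols):
    """Empirical joint entropy (bits) of one or more discrete columns."""
    counts = np.array(list(Counter(zip(*cols)).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mi(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return H(x) + H(y) - H(x, y)

def cmi(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z): the conditional
    version, usable as a score when conditioning on a third column."""
    return H(x, z) + H(y, z) - H(x, y, z) - H(z)

def partition_by_best_feature(X, y):
    """Score each discrete feature by its mutual information with the
    label, then split the dataset into blocks by the best feature's
    value, so each block can be trained and tested separately."""
    columns = list(zip(*X))
    scores = [mi(col, y) for col in columns]
    best = int(np.argmax(scores))
    blocks = {}
    for row, label in zip(X, y):
        blocks.setdefault(row[best], []).append((row, label))
    return best, blocks

# Toy usage: feature 0 determines the label, feature 1 is noise.
X = [(0, 1), (0, 0), (1, 1), (1, 0)]
y = [0, 0, 1, 1]
best, blocks = partition_by_best_feature(X, y)
print(best, sorted(blocks))   # 0 [0, 1]
```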

Record details

  • Author: Liu, Yunshu
  • Affiliation: Drexel University
  • Degree grantor: Drexel University
  • Subject: Electrical engineering
  • Degree: Ph.D.
  • Year: 2016
  • Pages: 110 p.
  • Total pages: 110
  • Format: PDF
  • Language: English
