Journal of Machine Learning Research

Maximum Entropy Discrimination Markov Networks



Abstract

The standard maximum margin approach for structured prediction lacks a straightforward probabilistic interpretation of the learning scheme and the prediction rule. Therefore its unique advantages, such as dual sparseness and kernel tricks, cannot be easily conjoined with the merits of a probabilistic model such as Bayesian regularization, model averaging, and the ability to model hidden variables. In this paper, we present a new general framework called maximum entropy discrimination Markov networks (MaxEnDNet, or simply MEDN), which integrates these two approaches and combines and extends their merits. Major innovations of this approach include: 1) It extends the conventional max-entropy discrimination learning of classification rules to a new structural max-entropy discrimination paradigm of learning a distribution of Markov networks. 2) It generalizes the extant Markov network structured-prediction rule based on a point estimator of model coefficients to an averaging model, akin to a Bayesian predictor, that integrates over a learned posterior distribution of model coefficients. 3) It admits flexible entropic regularization of the model during learning. By plugging in different prior distributions of the model coefficients, it subsumes the well-known maximum margin Markov networks (M3N) as a special case, and leads to a model similar to an L1-regularized M3N that is simultaneously primal and dual sparse, or other new types of Markov networks. 4) It applies a modular learning algorithm that combines existing variational inference techniques and convex-optimization-based M3N solvers as subroutines. Essentially, MEDN can be understood as a jointly maximum likelihood and maximum margin estimate of a Markov network.
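In the notation standard for maximum entropy discrimination (joint feature function F(x, y; w), prior p_0(w), slack variables ξ_i, and structured loss Δℓ_i(y)), the learning problem behind this joint estimate can be sketched as a KL-regularized program with expected-margin constraints; this is a sketch in the conventional MED notation, not a verbatim statement of the paper's formulation:

```latex
\min_{p(\mathbf{w}),\,\boldsymbol{\xi}} \;\;
\mathrm{KL}\!\left(p(\mathbf{w}) \,\|\, p_0(\mathbf{w})\right) + U(\boldsymbol{\xi})
\quad \text{s.t.} \quad
\int p(\mathbf{w})\,\big[\Delta F_i(\mathbf{y};\mathbf{w}) - \Delta\ell_i(\mathbf{y})\big]\, d\mathbf{w}
\;\ge\; -\xi_i,
\;\; \forall i,\; \forall \mathbf{y} \neq \mathbf{y}^i
```

where ΔF_i(y; w) = F(x^i, y^i; w) − F(x^i, y; w) is the margin of the true labeling y^i over a competitor y, and U(ξ) penalizes slack. Plugging in a Gaussian p_0 recovers M3N as a special case, while a Laplace p_0 yields the simultaneously primal- and dual-sparse variant described above.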
It represents the first successful attempt to combine maximum entropy learning (a dual form of maximum likelihood learning) with maximum margin learning of Markov networks for structured input/output problems; and the basic principle can be generalized to learning arbitrary graphical models, such as generative Bayesian networks or models with structured hidden variables. We discuss a number of theoretical properties of this approach, and show that empirically it outperforms a wide array of competing methods for structured input/output learning on both synthetic and real OCR and web data extraction data sets.
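As a toy illustration of the averaging prediction rule (innovation 2 above), the sketch below uses hypothetical names of our own (`feat`, `mu`, `Sigma` are illustrative, not the paper's code) to check numerically that under a Gaussian posterior over w, the averaged score E_{p(w)}[w · f(x, y)] coincides with scoring at the posterior mean, since the expectation is linear in w:

```python
import numpy as np

# Toy 3-class problem: joint feature map f(x, y) places x in the block for label y.
def feat(x, y):
    phi = np.zeros(3 * x.size)
    phi[y * x.size:(y + 1) * x.size] = x
    return phi

rng = np.random.default_rng(0)
x = rng.normal(size=4)
mu = rng.normal(size=12)        # hypothetical posterior mean of the coefficients w
Sigma = 0.1 * np.eye(12)        # hypothetical posterior covariance
labels = [0, 1, 2]

# Monte Carlo estimate of the averaged score E_{p(w)}[w . f(x, y)] ...
samples = rng.multivariate_normal(mu, Sigma, size=20000)
mc_scores = [samples @ feat(x, y) for y in labels]
avg_pred = int(np.argmax([s.mean() for s in mc_scores]))

# ... versus the closed form mu . f(x, y): the two prediction rules agree,
# because the expectation of a linear score is the score at the mean.
mean_pred = int(np.argmax([mu @ feat(x, y) for y in labels]))
print(avg_pred, mean_pred)
```

The point of the sketch is that for a linear score the Bayesian-style averaging predictor adds nothing beyond the posterior mean; the averaging rule matters when the posterior interacts nonlinearly with prediction, e.g. under sparsity-inducing priors or hidden variables.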


