Learning Large-Scale Conditional Random Fields.

Abstract

Conditional Random Fields (CRFs) [Lafferty et al., 2001] can offer computational and statistical advantages over generative models, yet traditional CRF parameter and structure learning methods are often too expensive to scale up to large problems. This thesis develops methods capable of learning CRFs for much larger problems. We do so by decomposing learning problems into smaller, simpler subproblems. These decompositions allow us to trade off sample complexity, computational complexity, and potential for parallelization, and we can often optimize these trade-offs in model- or data-specific ways. The resulting methods are theoretically motivated, are often accompanied by strong guarantees, and are effective and highly scalable in practice.

In the first part of our work, we develop core methods for CRF parameter and structure learning. For parameter learning, we analyze several methods and produce PAC learnability results for certain classes of CRFs. Structured composite likelihood estimation proves particularly successful in both theory and practice, and our results offer guidance for optimizing estimator structure. For structure learning, we develop a maximum-weight spanning tree-based method which outperforms other methods for recovering tree CRFs.

In the second part of our work, we take advantage of the growing availability of parallel platforms to speed up regression, a key component of our CRF learning methods. Our Shotgun algorithm for parallel regression can achieve near-linear speedups, and extensive experiments show it to be one of the fastest methods for sparse regression.
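The abstract describes Shotgun only at a high level, so the following is a minimal sketch of the general idea rather than the thesis's implementation. It assumes the sparse regression problem is the Lasso, 0.5 * ||Ax - y||^2 + lam * ||x||_1, and that "parallel" means updating several randomly chosen coordinates per round from the same residual. The parallel step is simulated sequentially, and every name here (`shotgun_style_lasso`, `num_parallel`, `rounds`) is hypothetical.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used by L1-regularized coordinate updates."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1 for a dense row-major matrix A."""
    residual = [yi - sum(a * xj for a, xj in zip(row, x)) for row, yi in zip(A, y)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in x)


def shotgun_style_lasso(A, y, lam, num_parallel=2, rounds=200, seed=0):
    """Coordinate descent that updates `num_parallel` coordinates per round.

    All updates in a round are computed from the same residual, which mimics
    (sequentially) what independent workers would do in parallel.
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    cols = [[A[i][j] for i in range(n)] for j in range(d)]
    col_sq = [sum(v * v for v in col) for col in cols]
    x = [0.0] * d
    residual = list(y)  # residual = y - A x, with x = 0 initially

    for _ in range(rounds):
        chosen = rng.sample(range(d), min(num_parallel, d))
        deltas = {}
        # "Parallel" phase: each update sees the same x and residual.
        for j in chosen:
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot reduce the loss
            rho = sum(a * r for a, r in zip(cols[j], residual)) + col_sq[j] * x[j]
            deltas[j] = soft_threshold(rho, lam) / col_sq[j] - x[j]
        # Apply phase: commit all updates and refresh the residual.
        for j, delta in deltas.items():
            x[j] += delta
            residual = [r - delta * a for r, a in zip(residual, cols[j])]
    return x
```

With `num_parallel=1` this reduces to ordinary stochastic coordinate descent. With larger values, updates computed from a shared residual can interfere when columns of `A` are strongly correlated, so the number of simultaneous updates is a tuning choice in this sketch.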

Bibliographic record

  • Author: Bradley, Joseph K.
  • Author affiliation: Carnegie Mellon University
  • Degree-granting institution: Carnegie Mellon University
  • Subject: Statistics; Computer Science
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 136 p.
  • Total pages: 136
  • Original format: PDF
  • Language: eng