首页> 外文期刊>JMLR: Workshop and Conference Proceedings >Information Theoretic Optimal Learning of Gaussian Graphical Models
【24h】

Information Theoretic Optimal Learning of Gaussian Graphical Models

机译:高斯图形模型的信息理论优化学习

获取原文
       

摘要

What is the optimal number of independent observations from which a sparse Gaussian Graphical Model can be correctly recovered? Information-theoretic arguments provide a lower bound on the minimum number of samples necessary to perfectly identify the support of any multivariate normal distribution as a function of model parameters. For a model defined on a sparse graph with $p$ nodes, a maximum degree $d$ and minimum normalized edge strength $kappa$, this necessary number of samples scales at least as $d log p/kappa^2$. The sample complexity requirements of existing methods for perfect graph reconstruction exhibit dependency on additional parameters that do not enter in the lower bound. The question of whether the lower bound is tight and achievable by a polynomial time algorithm remains open. In this paper, we constructively answer this question and propose an algorithm, termed DICE, whose sample complexity matches the information-theoretic lower bound up to a universal constant factor. We also propose a related algorithm SLICE that has a slightly higher sample complexity, but can be implemented as a mixed integer quadratic program which makes it attractive in practice. Importantly, SLICE retains a critical advantage of DICE in that its sample complexity only depends on quantities present in the information theoretic lower bound. We anticipate that this result will stimulate future search of computationally efficient sample-optimal algorithms.
机译:可以正确恢复稀疏高斯图形模型的独立观测的最佳数量是多少?信息 - 理论参数在最小数量的样本中提供了一个下限,以便完美地标识任何多元正常分布的支持,作为模型参数的函数。对于以$ P $节点的稀疏图定义的模型,最大程度为$ D $和最小归一化边缘强度$ kappa $,此必要数量的样本量至少随着$ d log p / kappa ^ 2 $ 。完美图形重建的现有方法的示例复杂性要求呈现对不进入下限的其他参数的依赖性。通过多项式时间算法紧密和可实现下限的问题仍然是开放的。在本文中,我们建设性地回答了这个问题并提出了一种称为骰子的算法,其样本复杂性与通用恒定因子的信息定义下限匹配。我们还提出了一种相关的算法切片,其具有稍高的样本复杂性,但可以实现为混合整数二次程序,这使其在实践中具有吸引力。重要的是,切片保留了骰子的关键优势,因为其样本复杂性仅取决于信息理论下限的数量。我们预计此结果将刺激未来的计算上高效的样本最佳算法。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号