Physical Review E

Learning maximum entropy models from finite-size data sets: A fast data-driven algorithm allows sampling from the posterior distribution



Abstract

Maximum entropy models provide the least constrained probability distributions that reproduce the statistical properties of experimental datasets. In this work we characterize the learning dynamics that maximizes the log-likelihood in the case of large but finite datasets. We first show that steepest-descent dynamics is not optimal, as it is slowed down by the inhomogeneous curvature of the model's parameter space. We then provide a way of rectifying this space that relies only on dataset properties and does not require large computational effort. We conclude by solving the long-time limit of the parameter dynamics, including the randomness generated by the systematic use of Gibbs sampling. In this stochastic framework, rather than converging to a fixed point, the dynamics reaches a stationary distribution, which for the rectified dynamics reproduces the posterior distribution of the parameters. We sum up all these insights in a “rectified” data-driven algorithm that is fast and, by sampling from the parameters' posterior, avoids both under- and overfitting along all directions of parameter space. By learning pairwise Ising models from recordings of a large population of retinal neurons, we show that our algorithm outperforms the steepest-descent method.
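The baseline the abstract argues against can be illustrated concretely. The sketch below is not the authors' rectified algorithm: it is a minimal toy example, with made-up parameters on a few spins, of fitting a pairwise Ising maximum entropy model by steepest ascent on the log-likelihood, whose gradient is the gap between data and model moments. It uses exact enumeration over all states, which is only feasible at toy sizes.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (not the paper's retina data): n binary spins, with "data"
# statistics generated from a known pairwise Ising model so convergence
# can be checked against ground truth.
n = 4
states = np.array(list(itertools.product([-1, 1], repeat=n)))  # all 2^n states

def model_stats(h, J):
    """Exact means <s_i> and correlations <s_i s_j> under P(s) ∝ exp(h·s + s·J·s/2)."""
    E = states @ h + 0.5 * np.einsum('ki,ij,kj->k', states, J, states)
    p = np.exp(E - E.max())
    p /= p.sum()
    m = p @ states                                    # magnetizations
    C = np.einsum('k,ki,kj->ij', p, states, states)   # pair correlations
    return m, C

# Ground-truth parameters define the target (data) statistics.
h_true = rng.normal(0, 0.3, n)
J_true = rng.normal(0, 0.3, (n, n)); J_true = (J_true + J_true.T) / 2
np.fill_diagonal(J_true, 0.0)
m_data, C_data = model_stats(h_true, J_true)

# Steepest ascent on the log-likelihood: the gradient with respect to
# (h, J) is the difference between data and model moments.
h = np.zeros(n); J = np.zeros((n, n)); lr = 0.1
for _ in range(5000):
    m, C = model_stats(h, J)
    h += lr * (m_data - m)
    dJ = lr * (C_data - C)
    np.fill_diagonal(dJ, 0.0)
    J += dJ

m_fit, C_fit = model_stats(h, J)
```

A fixed, uniform learning rate like `lr` here is precisely what suffers from the inhomogeneous curvature the abstract describes: directions of parameter space with small curvature converge slowly, while stiff directions limit how large the step can be.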
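The stochasticity the abstract attributes to "the systematic use of Gibbs sampling" comes from estimating model moments by Monte Carlo rather than exactly. A minimal single-spin-flip Gibbs sampler for a pairwise Ising model, with arbitrary toy parameters, looks like this; moment estimates from such finite chains are noisy, which is why the learning dynamics reaches a stationary distribution instead of a fixed point.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy pairwise Ising model P(s) ∝ exp(h·s + s·J·s/2) on n = 5 spins.
n = 5
h = rng.normal(0, 0.2, n)
J = rng.normal(0, 0.2, (n, n)); J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)

def gibbs_sweep(s):
    """One full sweep: resample each spin from its exact conditional."""
    for i in range(n):
        field = h[i] + J[i] @ s                       # local field on spin i
        p_up = 1.0 / (1.0 + np.exp(-2.0 * field))     # P(s_i = +1 | rest)
        s[i] = 1 if rng.random() < p_up else -1
    return s

s = rng.choice([-1, 1], size=n)
samples = np.array([gibbs_sweep(s).copy() for _ in range(4000)])
m_est = samples[500:].mean(axis=0)                    # burn-in discarded
```

Plugging moment estimates like `m_est` into the gradient step of a learning loop turns the deterministic dynamics into a stochastic one; the abstract's result is that, after rectification, the resulting stationary distribution of the parameters matches their Bayesian posterior.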
