A Maximum-Entropy Method to Estimate Discrete Distributions from Samples Ensuring Nonzero Probabilities

Abstract

When constructing discrete (binned) distributions from samples of a data set, there are applications where it is desirable to ensure that all bins of the sample distribution have nonzero probability. This is the case, for example, if the sample distribution is part of a predictive model that must return a response for the entire codomain, or if Kullback–Leibler divergence is used to measure the (dis-)agreement between the sample distribution and the original distribution of the variable: with zero-probability bins, the divergence is inconveniently infinite. Several sample-based distribution estimators exist that ensure nonzero bin probability, such as adding one counter to each zero-probability bin of the sample histogram, adding a small probability to the sample pdf, smoothing methods such as kernel-density smoothing, or Bayesian approaches based on the Dirichlet and multinomial distributions. Here, we suggest and test an approach based on the Clopper–Pearson method, which makes use of the binomial distribution. Based on the sample distribution, confidence intervals for the bin-occupation probability are calculated. The mean of each confidence interval is a strictly positive estimator of the true bin-occupation probability and is convergent with increasing sample size. For small samples, it converges towards a uniform distribution, i.e., the method effectively applies a maximum-entropy approach. We apply this nonzero method and four alternative sample-based distribution estimators to a range of typical distributions (uniform, Dirac, normal, multimodal, and irregular) and measure the effect with Kullback–Leibler divergence. While the performance of each method strongly depends on the distribution type it is applied to, on average, and especially for small sample sizes, the nonzero method, the simple “add one counter” method, and the Bayesian Dirichlet-multinomial model show very similar behavior and perform best. We conclude that, when estimating distributions without an a priori idea of their shape, applying one of these methods is favorable.
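
The estimator described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 95% confidence level (`alpha=0.05`), and the final renormalization so that the bin probabilities sum to one are assumptions made here for illustration. The Clopper–Pearson bounds are found by bisection on the binomial distribution using only the Python standard library.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, iters=60):
    """Root of a function that decreases from f(lo) > 0 to f(hi) < 0."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Two-sided Clopper-Pearson interval for k occupied out of n samples.

    alpha=0.05 is an assumption of this sketch, not a value from the abstract.
    """
    # Lower bound solves P(X >= k | p) = alpha/2; it is 0 when k == 0.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound solves P(X <= k | p) = alpha/2; it is 1 when k == n.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


def nonzero_estimate(counts, alpha=0.05):
    """Mean of each bin's confidence interval, renormalized to sum to 1.

    The renormalization step is an assumption of this sketch.
    """
    n = sum(counts)
    mids = [sum(clopper_pearson(k, n, alpha)) / 2 for k in counts]
    total = sum(mids)
    return [m / total for m in mids]


def kl_divergence(p, q):
    """D_KL(p || q) in bits; infinite if q is zero where p is not."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log2(pi / qi)
    return total


if __name__ == "__main__":
    true_dist = [0.1, 0.3, 0.4, 0.2]   # hypothetical "original" distribution
    counts = [0, 3, 5, 2]              # hypothetical sample histogram, n = 10
    freq = [k / sum(counts) for k in counts]
    print(kl_divergence(true_dist, freq))                      # inf
    print(kl_divergence(true_dist, nonzero_estimate(counts)))  # finite
    print(nonzero_estimate([0, 0, 0, 0]))                      # uniform
```

With no samples at all, every interval is [0, 1], so every bin receives the same midpoint and the estimate is uniform, which matches the maximum-entropy behavior the abstract describes for small samples.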

Bibliographic record

Authors
Paul Darscheid; Anneli Guthke; Uwe Ehret
Year 2018
Original format PDF
Language English

Similar literature

1. A Maximum-Entropy Method to Estimate Discrete Distributions from Samples Ensuring Nonzero Probabilities [J]. Paul Darscheid, Anneli Guthke, Uwe Ehret. Entropy. 2018, No. 8
2. Modeling Non-Equilibrium Dynamics of a Discrete Probability Distribution: General Rate Equation for Maximal Entropy Generation in a Maximum-Entropy Landscape with Time-Dependent Constraints [J]. Gian Paolo Beretta. Entropy. 2008, No. 3
3. A Method to Combine Non-Probability Sample Data with Probability Sample Data in Estimating Spatial Means of Environmental Variables [J]. D. J. Brus, J. J. de Gruijter. Environmental Monitoring and Assessment. 2003, No. 3
4. A New Method of Moments for Reducing the Sampling Variability of Estimated Values of Probability Distribution Parameters [C]. G. Najafian. International Conference on Offshore Mechanics and Arctic Engineering 2007 (OMAE2007), vol. 2; 20070610-15; San Diego, CA (US). 2007
5. A methodology to empirically derive the sampling distribution of the sample mean for any given probability density function [D]. David, Pierre Alain. 1998
6. Original article: Men who have sex with men in Great Britain: comparing methods and estimates from probability and convenience sample surveys [O]. Philip Prah, Ford Hickson, Chris Bonell
7. Modeling Non-Equilibrium Dynamics of a Discrete Probability Distribution: General Rate Equation for Maximal Entropy Generation in a Maximum-Entropy Landscape with Time-Dependent Constraints [O]. Gian Paolo Beretta. 2008
8. The Proportional Closeness and the Expected Sample Size of Sequential Procedures for Estimating Tail Probabilities in Exponential Distributions [R]. Zacks, S. 1973
