Mixtures of Exponential Distributions to Describe the Distribution of Poisson Means in Estimating the Number of Unobserved Classes

Abstract

In many fields of study, scientists are interested in estimating the number of unobserved classes. A biologist may want to find the number of rare species of an animal population in order to conserve, manage, and monitor biodiversity; a library manager may want to know how many non-circulating items are present in a library system; or a clinical investigator may want to determine the number of unseen disease occurrences. A traditional way of estimating an unknown number of classes is by using a negative binomial model (Fisher, Corbet, and Williams 1943). The negative binomial model assumes that the numbers of individuals from each class are independent Poisson samples, and that the means of these Poisson random variables follow a Gamma distribution. This thesis proposes a parametric model in which the distribution of the mean class frequencies is a finite mixture of exponential distributions. The proposed model has the following advantages: model simplicity, efficient computation using the EM algorithm, and a straightforward interpretation of the fitted model. In addition, the model can be assessed with a chi-squared goodness-of-fit procedure, a benefit this parametric model has over other commonly used non-parametric methods.

A main accomplishment of this thesis is providing an efficient computation of maximum likelihood (ML) estimates for the proposed model. Without the EM algorithm, finding ML estimates for this model can be difficult and time-consuming. The likelihood function is complicated by high dimensionality and the non-identifiability of certain parameters. Within the M step of our algorithm we embed another EM, which readily maximizes over the parameters of the finite mixture. We refer to the algorithm as a nested EM. Aitken's acceleration is used to increase the speed of the algorithm.

Microbial samples from the coast of Massachusetts Bay near Nahant, Massachusetts are used to demonstrate data analysis using three different numbers of components in the finite mixture of the model described. It is shown that the model produces reasonable estimates and fits the data satisfactorily. This model was recently introduced in species richness estimation (Hong et al. 2006), and its many advantages show promise for continued use in estimating the number of unobserved classes.
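
To make the model in the abstract concrete, the following is a minimal sketch and not the thesis's implementation. It relies on a standard fact: if a count is Poisson with mean lam, and lam is exponential with mean theta, the marginal count is geometric with P(X = k) = (1/(1+theta)) * (theta/(1+theta))^k, so a finite mixture of exponentials yields a finite mixture of geometrics. The function names, the parameter values, and the estimator N = n / (1 - p0) for the total number of classes are illustrative assumptions; the abstract does not state which estimator the thesis uses, and the nested EM fitting step is not shown.

```python
from math import fsum


def geometric_mixture_pmf(k, weights, means):
    """P(X = k) when X | lam ~ Poisson(lam) and lam follows a finite
    mixture of exponentials with the given mixing weights and means."""
    return fsum(
        w * (1.0 / (1.0 + m)) * (m / (1.0 + m)) ** k
        for w, m in zip(weights, means)
    )


def zero_truncated_pmf(k, weights, means):
    """P(X = k | X > 0): count distribution for classes seen at least once."""
    p0 = geometric_mixture_pmf(0, weights, means)
    return geometric_mixture_pmf(k, weights, means) / (1.0 - p0)


def estimate_total_classes(n_observed, weights, means):
    """Assumed estimator: observed classes divided by the fitted
    probability that a class is observed at all."""
    p0 = geometric_mixture_pmf(0, weights, means)
    return n_observed / (1.0 - p0)


if __name__ == "__main__":
    # Hypothetical two-component fit; these values are not from the thesis.
    weights, means = [0.6, 0.4], [0.5, 4.0]
    print("P(class unobserved):", geometric_mixture_pmf(0, weights, means))
    print("Estimated total classes:", estimate_total_classes(120, weights, means))
```

In practice the weights and means would come from a fitted model, such as the nested EM the abstract describes, rather than being fixed by hand.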

Bibliographic details

  • Author

    Barger Kathryn Jo-Anne

  • Year: 2006
  • Original format: PDF
  • Language: en_US
