
A measure of difference between discrete sample sets



Abstract

The estimation of statistical distance between populations is an important task in many applications. Conventional methods often rely on a maximum-likelihood (ML) estimator, usually for its analytical and computational simplicity. However, the ML point estimate provides no information about the uncertainty in the estimated parameters and distance, which grows as the amount of observed data decreases. In this paper, a new measure is developed for the statistical difference between finite-sized sample sets of discrete observations. The measure is defined as the expected distance between probability mass functions (pmfs), with the expectation carried out over Dirichlet posteriors on the pmfs given the observed samples. In contrast to conventional ML estimates of distance, this approach by design accounts for the uncertainty due to the finite size of the observation sets. In the limit of an infinite number of observation samples, the expected distance simplifies to the ML estimate. For small, finite-sized sample sets, the expected distance yields a more reliable measure of statistical difference.
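The contrast between the ML plug-in distance and the posterior-expected distance can be illustrated with a minimal sketch. The abstract does not say which distance between pmfs is used, which Dirichlet prior is assumed, or whether the expectation is evaluated in closed form. The Euclidean distance, the uniform prior (`prior=1.0`), the Monte Carlo approximation, and all function names below are therefore assumptions made for illustration, not the paper's method.

```python
import math
import random


def ml_pmf(counts):
    """Maximum-likelihood pmf: relative frequencies of the observed counts."""
    total = sum(counts)
    return [c / total for c in counts]


def euclidean(p, q):
    """Euclidean distance between two pmfs (an assumed choice of distance)."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def sample_dirichlet(alpha, rng):
    """Draw one pmf from Dirichlet(alpha) by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def ml_distance(counts_a, counts_b):
    """Plug-in distance between the two ML pmfs."""
    return euclidean(ml_pmf(counts_a), ml_pmf(counts_b))


def expected_distance(counts_a, counts_b, prior=1.0, n_draws=5000, seed=0):
    """Monte Carlo estimate of the expected distance under Dirichlet posteriors.

    Assumption: a symmetric Dirichlet prior with concentration `prior`, so the
    posterior for each sample set is Dirichlet(counts + prior).
    """
    rng = random.Random(seed)
    alpha_a = [c + prior for c in counts_a]
    alpha_b = [c + prior for c in counts_b]
    total = 0.0
    for _ in range(n_draws):
        p = sample_dirichlet(alpha_a, rng)
        q = sample_dirichlet(alpha_b, rng)
        total += euclidean(p, q)
    return total / n_draws


# Two small sample sets with identical relative frequencies.
small_a, small_b = [2, 1, 1], [2, 1, 1]
# Two large sample sets with clearly different relative frequencies.
large_a, large_b = [2000, 1000, 1000], [1000, 2000, 1000]

for a, b in [(small_a, small_b), (large_a, large_b)]:
    print(a, b, ml_distance(a, b), expected_distance(a, b))
```

For the two small sets the plug-in distance is exactly zero, whereas the posterior expectation is strictly positive because four observations leave both pmfs highly uncertain. For the two large sets the posteriors concentrate around the relative frequencies, so the two numbers nearly coincide, consistent with the limiting behaviour described in the abstract.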
