Model checking for incomplete high-dimensional categorical data (Incomplete data).

Abstract

Categorical data are often arranged in a contingency table and summarized by a loglinear model. A standard approach for comparing two competing models is to calculate twice the difference between their maximized loglikelihoods, which follows a χ² distribution asymptotically. But when data are sparse, the χ² approximation may be questionable.

As an alternative to a large-sample approximation to the reference distribution, we implement the framework introduced by Rubin (1984) for finding the posterior predictive check (PPC) distribution. The PPC distribution represents the conditional probability of a future value of a test statistic, given the observed data and the model specification, and it can serve as the reference distribution for the relevant likelihood-ratio statistics.

However, it can be computationally demanding to construct a PPC distribution based on a large number of replicates. This is especially the case when the original data are incomplete, since generating each PPC replicate requires an involved statistical computing approach (we use a data-augmentation strategy). In practice, we propose to approximate the PPC distribution by a gamma distribution whose parameters are estimated from a combination of training data and a modest-sized sample of PPC replicates. Some simulated examples suggest that this procedure, which can reduce the computation needed to approximate the PPC distribution by a factor of 20, has satisfactory statistical properties.
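The general idea can be illustrated with a minimal sketch, which is not the dissertation's procedure. It assumes complete data (so no data augmentation is needed), a two-way table, an independence model tested against the saturated model, and uniform Dirichlet priors on the row and column probabilities. The gamma parameters are fitted by simple moment matching to the replicates alone, without the training-data step the abstract mentions. The table `observed` and all function names are hypothetical.

```python
import math
import random


def g2_independence(table):
    """Likelihood-ratio statistic G^2: independence vs. the saturated model."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    g2 = 0.0
    for i, r in enumerate(table):
        for j, count in enumerate(r):
            if count > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / n
                g2 += 2.0 * count * math.log(count / expected)
    return g2


def dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution by normalizing gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def ppc_replicates(table, n_rep, rng):
    """G^2 values for tables drawn from the posterior predictive
    distribution under independence (assumed uniform Dirichlet priors)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    n_rows, n_cols = len(rows), len(cols)
    cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    stats = []
    for _ in range(n_rep):
        # Posterior draw of the margins: Dirichlet(counts + 1).
        p_row = dirichlet([r + 1 for r in rows], rng)
        p_col = dirichlet([c + 1 for c in cols], rng)
        weights = [p_row[i] * p_col[j] for i, j in cells]
        # Replicate table of the same total size as the observed one.
        rep = [[0] * n_cols for _ in range(n_rows)]
        for i, j in rng.choices(cells, weights=weights, k=n):
            rep[i][j] += 1
        stats.append(g2_independence(rep))
    return stats


def fit_gamma_moments(values):
    """Moment-matching gamma fit; returns (shape, scale)."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m * m / v, v / m


# Hypothetical sparse 3x3 table, for illustration only.
observed = [[5, 1, 0], [2, 4, 1], [0, 1, 6]]
rng = random.Random(0)

g2_obs = g2_independence(observed)
stats = ppc_replicates(observed, 200, rng)
shape, scale = fit_gamma_moments(stats)
# Empirical upper-tail proportion of the replicates at the observed value.
tail = sum(s >= g2_obs for s in stats) / len(stats)

print(f"G^2 = {g2_obs:.2f}, gamma shape = {shape:.2f}, scale = {scale:.2f}")
print(f"empirical PPC tail proportion = {tail:.3f}")
```

For a 3×3 table the asymptotic reference is χ² with 4 degrees of freedom, which is a gamma distribution with shape 2 and scale 2. Comparing the fitted shape and scale against those values gives a rough indication of how far the sparse-data reference distribution departs from the large-sample one.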

Bibliographic record

  • Author

    Hu, Ming-Yi

  • Author affiliation

    University of California, Los Angeles

  • Degree-granting institution: University of California, Los Angeles
  • Subjects: Statistics; Mathematics; Education, Mathematics
  • Degree: Ph.D.
  • Year: 1999
  • Pages: 87
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Statistics; Mathematics
  • Date indexed: 2022-08-17 11:47:56
