Adjusting for covariates in zero-inflated gamma and zero-inflated log-normal models for semicontinuous data.

Abstract

Semicontinuous data consist of a combination of a point mass at zero and a positive skewed distribution. This type of non-negative data distribution is found in data from many fields, but presents unique challenges for analysis. Specifically, these data cannot be analyzed using positive distributions, but distributions that are unbounded are also likely a poor fit. Two-part models incorporate both the zero values from semicontinuous data and the positive continuous values. In this dissertation, we compare zero-inflated gamma (ZIG) and zero-inflated log-normal (ZILN) two-part models. For both of these models, the probability that an outcome is non-zero is modeled via logistic regression. Then the distribution of the non-zero outcomes is modeled via gamma regression with a log-link for ZIG regression and via log-normal regression for ZILN.

In this dissertation we propose tests which combine the two parts of the ZIG and ZILN models in meaningful ways for performing a two group comparison. Then we compare these tests in terms of observed Type 1 error rates and power levels under both correctly specified and misspecified ZIG and ZILN models. Tests falling under two main hypotheses are examined. First, we look at two-part tests which come from a two-part hypothesis of no difference between the two groups in terms of the probability of non-zero values and in terms of the mean of the non-zero values. The second type of tests are mean-based tests. These combine the two parts of the model in ways related to the overall group means of the semicontinuous variable. When not adjusting for covariates, two tests are developed based on a difference of means (DM) and a ratio of means (RM). When adjusting for covariates, tests using mean-based hypotheses are developed which marginalize over the values of the adjusting covariates. Under the adjusting framework, two ratio of means statistics are proposed and examined, an average of the subject specific ratio of means (RMSS) and a ratio of the marginal group means (RMMAR). Simulations are used to compare Type 1 error and power for these tests and standard two group comparison tests.

Simulation results show that when ZIG and ZILN models are misspecified and the coefficient of variation (CoV) and/or sample size is large, there are differences in Type 1 error and power results between the misspecified and correctly specified models. Specifically, when ZILN data with high CoV or sample size are analyzed as ZIG, Type 1 error rates are prohibitively high. On the other hand, when ZIG data are analyzed as ZILN under these scenarios, power levels are much lower for ZILN analyses than for ZIG analyses. Examination of Q-Q plots shows, however, that in these settings, distinguishing between ZIG and ZILN data can be relatively straightforward. When the coefficient of variation is small it is harder to distinguish between ZIG and ZILN models, but the differences between Type 1 error rates and power levels for misspecified or correctly specified models are also slight.

Finally, we use the proposed methods to analyze a data set involving Parkinson's disease (PD) and driving. A number of these methods show that PD subjects exhibit poorer lane keeping ability than control subjects.
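To make the mean-based quantities in the abstract concrete, the following is a minimal sketch of the unadjusted case only. It assumes that the overall mean of a semicontinuous variable is the probability of a non-zero value times the mean of the non-zero values, and it computes point estimates of the difference of means (DM) and ratio of means (RM) for two groups. The function names `two_part_mean` and `mean_contrasts` are hypothetical, and the sketch does not reproduce the dissertation's test statistics, standard errors, or covariate-adjusted versions (RMSS, RMMAR), none of which are specified in the abstract.

```python
import numpy as np


def two_part_mean(y, positive_part="gamma"):
    """Estimate E[Y] = P(Y > 0) * E[Y | Y > 0] for one group, no covariates.

    Hypothetical helper for illustration; not taken from the dissertation.
    """
    y = np.asarray(y, dtype=float)
    pos = y[y > 0]
    if pos.size == 0:
        raise ValueError("need at least one positive observation")

    # Part 1: proportion of non-zero outcomes (the intercept-only
    # logistic regression estimate of P(Y > 0)).
    p_nonzero = pos.size / y.size

    # Part 2: mean of the positive outcomes under the chosen model.
    if positive_part == "gamma":
        # With no covariates, the gamma maximum-likelihood estimate of the
        # mean is the sample mean of the positive values.
        mu_pos = pos.mean()
    elif positive_part == "lognormal":
        # Log-normal mean is exp(m + s^2 / 2), where m and s^2 are the mean
        # and (maximum-likelihood) variance of log(y).
        logs = np.log(pos)
        mu_pos = np.exp(logs.mean() + logs.var() / 2.0)
    else:
        raise ValueError("positive_part must be 'gamma' or 'lognormal'")

    return p_nonzero * mu_pos


def mean_contrasts(y0, y1, positive_part="gamma"):
    """Point estimates of the difference (DM) and ratio (RM) of group means."""
    m0 = two_part_mean(y0, positive_part)
    m1 = two_part_mean(y1, positive_part)
    return {"DM": m1 - m0, "RM": m1 / m0}


if __name__ == "__main__":
    control = [0, 0, 2, 4]   # made-up illustrative values
    treated = [0, 3, 3, 6]   # made-up illustrative values
    print(mean_contrasts(control, treated, positive_part="gamma"))
```

Switching `positive_part` between `"gamma"` and `"lognormal"` on the same data gives a rough feel for how the two models can disagree about the group means when the positive values are highly variable, which is the kind of misspecification the simulations in the abstract examine.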

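Bibliographic record

  • Author

    Mills, Elizabeth Dastrup

  • Author affiliation

    The University of Iowa

  • Degree-granting institution: The University of Iowa
  • Subject: Biology Biostatistics
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 302 p.
  • Total pages: 302
  • Original format: PDF
  • Language of text: eng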
