Statistical methods for multi-state analysis of incomplete longitudinal data.

Abstract

Analyses of longitudinal categorical data are typically based on semiparametric models in which covariate effects are expressed on marginal probabilities and estimation is carried out using generalized estimating equations (GEE). Methods based on GEE are motivated in part by the lack of tractable models for clustered categorical data. However, such marginal methods may not yield fully efficient estimates, nor consistent estimates when missing data are present. In the first part of the thesis I develop a Markov model for the analysis of longitudinal categorical data which facilitates modeling of both marginal and conditional structures. A likelihood formulation is employed for inference, so the resulting estimators enjoy properties such as optimal efficiency and consistency, and remain consistent when data are missing at random. Simulation studies demonstrate that the proposed method performs well under a variety of situations. Application to data from a smoking prevention study illustrates the utility of the model and the interpretation of covariate effects.

In practice, we often face data with missing values in both the response and the covariates, and sometimes the missingness of the response is associated with the missingness of the covariates. Proper analysis of this type of data requires taking this association into account. The impact of attrition in longitudinal studies depends on the correlation between the missing response and the missing covariate. Ignoring such correlation can bias statistical inference. We study a method that incorporates the association between the missingness of the response and that of the covariate through the use of inverse probability weighted generalized estimating equations. Simulations illustrate that the proposed method yields a consistent estimator, while the method that ignores the association yields an inconsistent estimator.

Many analyses of incomplete longitudinal data focus on the impact of covariates on the mean responses. However, little attention has been directed to the impact of missing covariates on the association parameters in clustered longitudinal studies. The last part of this thesis mainly addresses this problem. Weighted first- and second-order estimating equations are constructed to obtain consistent estimates of mean and association parameters.

Incomplete data arise in many areas of research. This phenomenon is common in longitudinal data on the disease history of subjects. Progressive models provide a convenient framework for characterizing disease processes, which arise, for example, when the state represents the degree of irreversible damage incurred by the subject. Problems arise if the mechanism leading to the missing data is related to the response process. A naive analysis might lead to biased results and invalid inferences. The second part of this thesis begins with an investigation of progressive multi-state models for longitudinal studies with incomplete observations. Maximum likelihood estimation is carried out using an EM algorithm, and variance estimation is provided using Louis' method. In general, maximum likelihood estimates are valid when the missing data mechanism is missing completely at random or missing at random. Here we provide a likelihood-based method in which the parameters are identifiable regardless of the missing data mechanism. Simulation studies demonstrate that the proposed method works well under a variety of situations.
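As a rough illustration of the inverse probability weighting idea mentioned in the abstract, the sketch below estimates a mean when the response is missing at random given a fully observed categorical covariate. It is not the thesis's estimator: the thesis deals with joint missingness in responses and covariates within GEE, whereas this toy example only solves the simplest weighted estimating equation. The function name `ipw_mean` and the toy data are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def ipw_mean(x, y, observed):
    """Inverse-probability-weighted mean of y, assuming y is missing
    at random given the fully observed categorical covariate x.
    (Hypothetical helper for illustration, not from the thesis.)"""
    # Estimate P(observed | x) by the observed fraction within each level of x.
    n_total = defaultdict(int)
    n_obs = defaultdict(int)
    for xi, ri in zip(x, observed):
        n_total[xi] += 1
        n_obs[xi] += int(ri)
    pi = {level: n_obs[level] / n_total[level] for level in n_total}

    # Solve the weighted estimating equation
    #   sum_i (r_i / pi_i) * (y_i - mu) = 0
    # for mu, using only records whose response was observed.
    num = sum(yi / pi[xi] for xi, yi, ri in zip(x, y, observed) if ri)
    den = sum(1.0 / pi[xi] for xi, ri in zip(x, observed) if ri)
    return num / den


# Toy data: the response is observed less often when x == 1.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 1, 1, 1, 3, None, None, None]   # None marks a missing response
observed = [yi is not None for yi in y]

# Complete-case mean, which ignores the missingness mechanism.
complete_case = sum(yi for yi in y if yi is not None) / sum(observed)
weighted = ipw_mean(x, y, observed)
print(complete_case, weighted)
```

In this toy setup every response in the `x == 1` group equals 3, so the full-data mean would be 2. The complete-case average is pulled toward the over-represented `x == 0` group, while the weighted estimate up-weights the single observed `x == 1` record to compensate.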

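Bibliographic record

  • Author: Chen, Baojiang
  • Author affiliation: University of Waterloo (Canada)
  • Degree-granting institution: University of Waterloo (Canada)
  • Subject: Statistics
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 224 p.
  • Total pages: 224
  • Original format: PDF
  • Language of text: English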
