Identifying patterns in behavioral public health data using mixture modeling with an informative number of repeated measures.


Abstract

Finite mixture modeling is a useful statistical technique for clustering individuals based on patterns of responses. The approach assumes that the population contains latent clusters of individuals, each generating its own distinct distribution of observations (univariate or multivariate), and that these distributions are mixed together in the full population. The name "mixture" reflects the fact that what we observe is a mixture of distributions. The goal of this model-based clustering technique is to identify the component distributions so that individuals can be clustered according to their response patterns. Finite mixture models, and the special case of latent class analysis, are commonly applied to data that inherently involve repeated measures. The purpose of this dissertation is to extend the finite mixture model so that the number of repeated measures is incorporated and contributes to the clustering of individuals rather than measures. The dimension of the repeated measures, or simply the count of responses, is assumed to follow a truncated Poisson distribution, and this information is incorporated into what we call a dimension informative finite mixture model (DIMM). The outline of this dissertation is as follows.

Paper 1 is entitled "Dimension Informative Mixture Modeling (DIMM) for questionnaire data with an informative number of repeated measures." This paper describes the type of data structures considered and introduces the DIMM. A simulation study examines how well the DIMM recovers a known, specified truth. In the first scenario, we specify a mixture of three univariate normal distributions with different means and similar variances, with different and similar counts of repeated measurements. We found that the DIMM predicts the true underlying class membership better than the traditional finite mixture model, as judged by a predicted value metric score. In the second scenario, we specify a mixture of two univariate normal distributions with the same means and variances, with different and similar counts of repeated measurements. We found that the count-informative finite mixture model predicts the truth much better than the non-informative finite mixture model.

Paper 2 is entitled "Patterns of Physical Activity in the Northern Manhattan Study (NOMAS) Using Multivariate Finite Mixture Modeling (MFMM)." This study applies a multivariate finite mixture modeling approach to identify latent clusters of physical activity profiles based on four dimensions: total frequency of activities, average duration per activity, total energy expenditure, and the total count of different activities performed. We found a five-cluster solution to describe the complex patterns of physical activity levels, as measured by fifteen physical activity items, among a US-based elderly cohort. With an added class of individuals who did no physical activity, the six clusters are labeled: no exercise, very inactive, somewhat inactive, slightly under guidelines, meet guidelines, and above guidelines. This methodology improves upon previous work that used only the total metabolic equivalent (a proxy for energy expenditure) to classify individuals as inactive, active, or highly active.

Paper 3 is entitled "Complex Drug Use Patterns and Associated HIV Transmission Risk Behaviors in an Internet Sample of US Men Who Have Sex With Men." This study incorporates count information into a latent class analysis of nineteen binary items on drugs consumed before a sexual encounter within the past year. In addition to the individual drugs used, the mixture model incorporated a count of the total number of drugs used. We found a six-class solution: low drug use, some recreational drug use, nitrite inhalants (poppers) with prescription erectile dysfunction (ED) drug use, poppers with prescription/non-prescription ED drug use, and high polydrug use. Compared to participants in the low drug use class, participants in the highest drug use class were 5.5 times more likely to report unprotected anal intercourse (UAI) in their last sexual encounter and approximately 4 times more likely to report a new sexually transmitted infection (STI) in the past year. Younger men were also less likely than older men to report UAI but more likely to report an STI.
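The count-informative idea described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: it assumes that the truncation is at zero (every individual gives at least one response), that repeated measures are conditionally independent normal draws given the class, and that the class-specific count term simply multiplies the usual mixture likelihood. The function names and all parameter values below are hypothetical and chosen only to mimic the second simulation scenario, in which two classes share the same mean and variance but differ in how many responses they tend to give.

```python
import math


def normal_logpdf(y, mean, sd):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (y - mean) ** 2 / (2 * sd ** 2)


def zt_poisson_logpmf(n, rate):
    """Log pmf of a zero-truncated Poisson, P(N = n | N >= 1) (assumed truncation)."""
    if n < 1:
        return float("-inf")
    log_poisson = n * math.log(rate) - rate - math.lgamma(n + 1)
    # Renormalize by P(N >= 1) = 1 - exp(-rate).
    return log_poisson - math.log1p(-math.exp(-rate))


def class_posteriors(ys, weights, means, sds, rates=None):
    """Posterior class probabilities for one individual's responses `ys`.

    With rates=None this is an ordinary finite mixture of normals.
    With rates given, the number of responses len(ys) adds a
    zero-truncated Poisson term per class (the count-informative variant).
    """
    log_terms = []
    for k in range(len(weights)):
        term = math.log(weights[k])
        term += sum(normal_logpdf(y, means[k], sds[k]) for y in ys)
        if rates is not None:
            term += zt_poisson_logpmf(len(ys), rates[k])
        log_terms.append(term)
    # Normalize on the log scale for numerical stability.
    peak = max(log_terms)
    unnormalized = [math.exp(t - peak) for t in log_terms]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical parameters: identical means and variances, different count rates.
weights = [0.5, 0.5]
means = [0.0, 0.0]
sds = [1.0, 1.0]
rates = [2.0, 8.0]

# One individual with eight repeated measures (made-up values).
responses = [0.3, -1.1, 0.8, 0.2, -0.4, 1.5, -0.7, 0.1]

plain = class_posteriors(responses, weights, means, sds)
informed = class_posteriors(responses, weights, means, sds, rates)
print("without count information:", plain)
print("with count information:   ", informed)
```

Under these made-up parameters the ordinary mixture cannot separate the two classes, because the response values carry no class information and the posteriors stay equal. Once the count term is included, an individual with eight responses is assigned mostly to the class with the higher count rate. In practice the parameters would be estimated from data, for example with an EM algorithm, which this sketch does not attempt.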

Bibliographic record

  • Author

    Yu, Gary

  • Author affiliation

    Columbia University

  • Degree-granting institution: Columbia University
  • Subjects: Biology, Biostatistics; Health Sciences, Public Health; Health Sciences, Epidemiology
  • Degree: Dr.P.H.
  • Year: 2014
  • Pagination: 112 p.
  • Total pages: 112
  • Original format: PDF
  • Language of text: eng
