Source: Environmental Health Perspectives (U.S. National Institutes of Health literature)

A simulation study of confounding in generalized linear models for air pollution epidemiology.


Abstract

Confounding between the model covariates and causal variables (which may or may not be included as model covariates) is a well-known problem in regression models used in air pollution epidemiology. This problem is usually acknowledged but hardly ever investigated, especially in the context of generalized linear models. Using synthetic data sets, the present study shows how model overfit, underfit, and misfit in the presence of correlated causal variables in a Poisson regression model affect the estimated coefficients of the covariates and their confidence levels. The study also shows how this effect changes with the ranges of the covariates and the sample size. There is qualitative agreement between these study results and the corresponding expressions in the large-sample limit for the ordinary linear models. Confounding of covariates in an overfitted model (with covariates encompassing more than just the causal variables) does not bias the estimated coefficients but reduces their significance. The effect of model underfit (with some causal variables excluded as covariates) or misfit (with covariates encompassing only noncausal variables), on the other hand, leads to not only erroneous estimated coefficients, but a misguided confidence, represented by large t-values, that the estimated coefficients are significant. The results of this study indicate that models which use only one or two air quality variables, such as particulate matter ≤ 10 µm and sulfur dioxide, are probably unreliable, and that models containing several correlated and toxic or potentially toxic air quality variables should also be investigated in order to minimize the situation of model underfit or misfit.
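
The overfit, underfit, and misfit cases described in the abstract can be illustrated with a small synthetic experiment. What follows is a minimal sketch and not the authors' code or design: the sample size, the correlation of 0.8, the coefficient values, the Gaussian covariates, and every function and variable name are assumptions chosen for illustration. Two correlated causal variables (x1, x2) drive a Poisson outcome, x3 is a noncausal variable correlated with x1, and four Poisson regressions are fitted by Newton-Raphson using only the standard library.

```python
import math
import random

# Assumed true model (illustrative values): log E[y] = B0 + B1*x1 + B2*x2.
B0, B1, B2 = 1.0, 0.3, 0.3
RHO = 0.8  # assumed correlation of x2 and of x3 with x1


def poisson_draw(rng, mu):
    """Draw one Poisson variate by Knuth's method (fine for small means)."""
    limit, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_poisson(columns, y, iters=15):
    """Fit a log-link Poisson regression with an intercept.

    Returns (coefficients, model-based standard errors), intercept first.
    """
    rows = [[1.0] + list(r) for r in zip(*columns)]
    p = len(rows[0])

    def score_and_info(beta):
        score = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for x, yi in zip(rows, y):
            mu = math.exp(sum(b * v for b, v in zip(beta, x)))
            for j in range(p):
                score[j] += (yi - mu) * x[j]
                for k in range(p):
                    info[j][k] += mu * x[j] * x[k]
        return score, info

    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)
    for _ in range(iters):
        score, info = score_and_info(beta)
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    _, info = score_and_info(beta)
    # Standard errors: square roots of the diagonal of the inverse information.
    se = [math.sqrt(solve(info, [float(i == j) for i in range(p)])[j])
          for j in range(p)]
    return beta, se


def simulate(n=2000, seed=1):
    """Generate correlated covariates and a Poisson outcome caused by x1, x2."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - RHO ** 2)
    x1, x2, x3, y = [], [], [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        b = RHO * a + s * rng.gauss(0, 1)  # causal, correlated with x1
        c = RHO * a + s * rng.gauss(0, 1)  # noncausal, correlated with x1
        x1.append(a)
        x2.append(b)
        x3.append(c)
        y.append(poisson_draw(rng, math.exp(B0 + B1 * a + B2 * b)))
    return x1, x2, x3, y


def run_study(n=2000, seed=1):
    """Fit the four model specifications; return {model: {name: (coef, se)}}."""
    x1, x2, x3, y = simulate(n, seed)
    designs = {
        "correct": ([x1, x2], ["x1", "x2"]),
        "overfit": ([x1, x2, x3], ["x1", "x2", "x3"]),
        "underfit": ([x1], ["x1"]),
        "misfit": ([x3], ["x3"]),
    }
    results = {}
    for name, (cols, labels) in designs.items():
        beta, se = fit_poisson(cols, y)
        results[name] = {lab: (b, s)
                         for lab, b, s in zip(labels, beta[1:], se[1:])}
    return results


if __name__ == "__main__":
    for name, coefs in run_study().items():
        for lab, (b, s) in coefs.items():
            print(f"{name:9s} {lab}: coef={b:6.3f}  se={s:.3f}  z={b / s:6.1f}")
```

Under these assumptions, the overfitted model should recover a coefficient for x1 close to the assumed 0.3 with a larger standard error than the correct model. The underfitted and misfitted models should report coefficients far from the true causal effects (the true effect of x3 is zero) together with large z-values, which is the pattern of misplaced confidence that the abstract warns about. For jointly Gaussian covariates and a log link, the underfit slope on x1 tends toward B1 + RHO*B2 in large samples.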