One of the goals of spatial epidemiology is to identify areas with elevated disease risk. Such analyses are often hampered by the limited geographical resolution of the available data. When data are aggregated into spatial units, conditional autoregressive (CAR) models are commonly used. When data are available at higher resolution (e.g. geocodes), log-Gaussian Cox processes (LGCPs) provide a more natural modelling framework. In theory, LGCPs should perform better, but do they in practice? We simulated data mimicking childhood leukaemia incidence in the Canton of Zurich, Switzerland (n=334 cases during 1985-2015). Geocoded locations of residence were available for the entire population, and we randomly sampled case locations from these data under different risk scenarios. We considered 39 scenarios that varied the shape of the true risk function (constant, step-wise, exponential decay), the size of the high-risk areas (radii of 1, 5 and 10 km), the risk increase in the high-risk areas (2-fold and 5-fold) and the number of cases (n, 5n and 10n). We compared the ability of the models to recover the true risk surface using the root mean integrated squared error (RMISE), and their ability to identify high-risk areas using the area under the receiver operating characteristic curve (AUC). CAR models recovered the step-wise true risk surface with lower error across all scenarios (range of median RMISE across scenarios: 0.05-0.25) than LGCPs (median RMISE: 1.80-37.2). For exponential decay risk surfaces, however, LGCPs performed better (median RMISE: 1.70-20) than CAR models (median RMISE: 1.80-32) in almost all scenarios. The ability to detect high-risk areas was higher for LGCPs (median AUC: 0.81-1) than for CAR models (median AUC: 0.65-0.93) across almost all scenarios. Our simulation study suggests that, under realistic scenarios, continuous domain models outperform discrete domain models in estimating risk surfaces and identifying high-risk areas. This argues for moving towards continuous domain models in spatial epidemiology.
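The scenario count and the two evaluation metrics can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the function names (`build_scenarios`, `rmise`, `auc`) are hypothetical, the constant-risk surface is assumed to vary only in the number of cases (which is what makes the factor levels add up to 39), the RMISE is approximated by a Riemann sum over equally sized grid cells with an unspecified normalisation, and the AUC is computed from pairwise comparisons of high-risk and low-risk cells.

```python
from itertools import product
from math import sqrt

N_OBSERVED = 334  # observed cases in the Canton of Zurich, 1985-2015


def build_scenarios():
    """Enumerate the simulation scenarios (assumed factor structure)."""
    case_counts = [N_OBSERVED * m for m in (1, 5, 10)]
    # Assumption: a constant risk surface has no high-risk area, so it is
    # crossed with the number of cases only.
    scenarios = [
        {"shape": "constant", "radius_km": None, "risk_ratio": 1, "n_cases": c}
        for c in case_counts
    ]
    for shape, radius, ratio, c in product(
        ("step-wise", "exponential decay"), (1, 5, 10), (2, 5), case_counts
    ):
        scenarios.append(
            {"shape": shape, "radius_km": radius, "risk_ratio": ratio, "n_cases": c}
        )
    return scenarios


def rmise(estimated, true, cell_area=1.0):
    """Riemann-sum approximation of sqrt(integral of (estimated - true)^2).

    Both inputs are flat sequences of risk values on the same regular grid;
    cell_area is the area of one grid cell (hypothetical parameter).
    """
    squared_error = sum((e - t) ** 2 for e, t in zip(estimated, true))
    return sqrt(squared_error * cell_area)


def auc(scores, is_high_risk):
    """Probability that a random high-risk cell outscores a random low-risk
    cell, counting ties as one half (equivalent to the area under the ROC
    curve)."""
    pos = [s for s, h in zip(scores, is_high_risk) if h]
    neg = [s for s, h in zip(scores, is_high_risk) if not h]
    if not pos or not neg:
        raise ValueError("need at least one high-risk and one low-risk cell")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(len(build_scenarios()))
```

Under this assumed structure, the three constant-risk scenarios plus 2 shapes x 3 radii x 2 risk increases x 3 case counts give the 39 scenarios stated above. The pairwise AUC is quadratic in the number of cells, which is acceptable for a sketch but would normally be replaced by a rank-based computation on fine grids.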