Journal: Ecology: A Publication of the Ecological Society of America

Cross-validation of species distribution models: Removing spatial sorting bias and calibration with a null model

Abstract

Species distribution models are usually evaluated with cross-validation. In this procedure evaluation statistics are computed from model predictions for sites of presence and absence that were not used to train (fit) the model. Using data for 226 species, from six regions, and two species distribution modeling algorithms (Bioclim and MaxEnt), I show that this procedure is highly sensitive to "spatial sorting bias": the difference between the geographic distance from testing-presence to training-presence sites and the geographic distance from testing-absence (or testing-background) to training-presence sites. I propose the use of pairwise distance sampling to remove this bias, and the use of a null model that only considers the geographic distance to training sites to calibrate cross-validation results for remaining bias. Model evaluation results (AUC) were strongly inflated: the null model performed better than MaxEnt for 45% and better than Bioclim for 67% of the species. Spatial sorting bias and area under the receiver-operator curve (AUC) values increased when using partitioned presence data and random-absence data instead of independently obtained presence-absence testing data from systematic surveys. Pairwise distance sampling removed spatial sorting bias, yielding null models with an AUC close to 0.5, such that AUC was the same as null model calibrated AUC (cAUC). This adjustment strongly decreased AUC values and changed the ranking among species. Cross-validation results for different species are only comparable after removal of spatial sorting bias and/or calibration with an appropriate null model.
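The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. It assumes planar coordinates with Euclidean distance, summarizes spatial sorting bias as the ratio of the two mean nearest-neighbor distances (the abstract only describes a difference between them), scores the geographic null model by negative distance to the nearest training presence, and uses a simplified greedy matching for pairwise distance sampling. All function names, the `tolerance` parameter, and the toy coordinates are hypothetical.

```python
import math

def nearest_distance(points, references):
    """Distance from each point to its nearest reference site (planar coordinates assumed)."""
    return [min(math.dist(p, r) for r in references) for p in points]

def auc(pos_scores, neg_scores):
    """AUC as the share of presence/absence pairs ranked correctly (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

def spatial_sorting_bias(test_pres, test_abs, train_pres):
    """Mean distance to the nearest training presence for testing presences (d_p)
    and testing absences (d_a), plus their ratio (assumed summary; 1 means no bias)."""
    d_p = sum(nearest_distance(test_pres, train_pres)) / len(test_pres)
    d_a = sum(nearest_distance(test_abs, train_pres)) / len(test_abs)
    return d_p, d_a, d_p / d_a

def null_model_auc(test_pres, test_abs, train_pres):
    """Geographic null model: sites closer to a training presence get a higher score."""
    pos = [-d for d in nearest_distance(test_pres, train_pres)]
    neg = [-d for d in nearest_distance(test_abs, train_pres)]
    return auc(pos, neg)

def pairwise_distance_sample(test_pres, test_abs, train_pres, tolerance=0.33):
    """Simplified greedy matching (an assumption, not the published procedure):
    pair each testing presence with the unused testing absence whose distance to
    the nearest training presence is most similar, and keep the pair only if the
    distances differ by at most tolerance * presence distance."""
    d_pres = nearest_distance(test_pres, train_pres)
    d_abs = nearest_distance(test_abs, train_pres)
    unused = set(range(len(test_abs)))
    kept_pres, kept_abs = [], []
    for i, dp in enumerate(d_pres):
        if not unused:
            break
        j = min(unused, key=lambda k: abs(d_abs[k] - dp))
        if abs(d_abs[j] - dp) <= tolerance * dp:
            kept_pres.append(test_pres[i])
            kept_abs.append(test_abs[j])
            unused.remove(j)
    return kept_pres, kept_abs

# Toy data (hypothetical): two of the testing absences lie far from all training presences.
train_pres = [(0.0, 0.0), (1.0, 0.0)]
test_pres = [(0.0, 1.0), (1.0, 1.0)]
test_abs = [(0.0, 1.2), (5.0, 5.0), (6.0, 1.0), (1.0, 0.9)]

print("bias ratio before sampling:", spatial_sorting_bias(test_pres, test_abs, train_pres)[2])
print("null model AUC before sampling:", null_model_auc(test_pres, test_abs, train_pres))

pres_s, abs_s = pairwise_distance_sample(test_pres, test_abs, train_pres)
print("bias ratio after sampling:", spatial_sorting_bias(pres_s, abs_s, train_pres)[2])
print("null model AUC after sampling:", null_model_auc(pres_s, abs_s, train_pres))
```

In this toy case the distance-only null model separates presences from absences well before sampling, because the absences are on average much farther from the training sites. After matching on distance, the null model has no geographic signal left to exploit, which is the behavior the abstract describes for pairwise distance sampling.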
