Journal: Ecological Modelling

Species-specific tuning increases robustness to sampling bias in models of species distributions: An implementation with Maxent


Abstract

Various methods exist to model a species' niche and geographic distribution using environmental data for the study region and occurrence localities documenting the species' presence (typically from museums and herbaria). In presence-only modelling, geographic sampling bias and small sample sizes represent challenges for many species. Overfitting to the bias and/or noise characteristic of such datasets can seriously compromise model generality and transferability, which are critical to many current applications - including studies of invasive species, the effects of climatic change, and niche evolution. Even when transferability is not necessary, applications to many areas, including conservation biology, macroecology, and zoonotic diseases, require models that are not overfit. We evaluated these issues using a maximum entropy approach (Maxent) for the shrew Cryptotis meridensis, which is endemic to the Cordillera de Mérida in Venezuela. To simulate strong sampling bias, we divided localities into two datasets: those from a portion of the species' range that has seen high sampling effort (for model calibration) and those from other areas of the species' range, where less sampling has occurred (for model evaluation). Before modelling, we assessed the climatic values of localities in the two datasets to determine whether any environmental bias accompanies the geographic bias. Then, to identify optimal levels of model complexity (and minimize overfitting), we made models and tuned model settings, comparing performance with that achieved using default settings. We randomly selected localities for model calibration (sets of 5, 10, 15, and 20 localities) and varied the level of model complexity considered (linear versus both linear and quadratic features) and two aspects of the strength of protection against overfitting (regularization). 
Environmental bias indeed corresponded to the geographic bias between datasets, with differences in median and observed range (minima and/or maxima) for some variables. Model performance varied greatly according to the level of regularization. Intermediate regularization consistently led to the best models, with decreased performance at low and generally at high regularization. Optimal levels of regularization differed between sample-size-dependent and sample-size-independent approaches, but both reached similar levels of maximal performance. In several cases, the optimal regularization value was different from (usually higher than) the default one. Models calibrated with both linear and quadratic features outperformed those made with just linear features. Results were remarkably consistent across the examined sample sizes. Models made with few and biased localities achieved high predictive ability when appropriate regularization was employed and optimal model complexity was identified. Species-specific tuning of model settings can have great benefits over the use of default settings.
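
The tuning design described in the abstract (calibration sets of 5, 10, 15, and 20 localities, two levels of feature complexity, and a range of regularization strengths) can be pictured as a small grid search. The sketch below is illustrative only and is not the authors' code: the regularization values are assumed, `fit_and_score` is a hypothetical callable standing in for "fit a Maxent model on the calibration localities and score it on the evaluation localities", and `toy_score` is a placeholder that exists purely so the loop runs. It also collapses the two aspects of regularization mentioned in the abstract into a single multiplier.

```python
from itertools import product
import random

# Design factors named in the abstract.
SAMPLE_SIZES = [5, 10, 15, 20]
FEATURE_SETS = [("linear",), ("linear", "quadratic")]
# Assumed grid of regularization multipliers; the abstract does not list values.
REG_MULTIPLIERS = [0.5, 1.0, 2.0, 4.0]


def tune(localities, fit_and_score, seed=0):
    """Grid-search feature sets and regularization for each sample size.

    fit_and_score(calibration, features, reg) -> float is a hypothetical
    user-supplied callable (higher is better). Returns a dict mapping each
    sample size to (best_score, features, reg).
    """
    rng = random.Random(seed)
    best = {}
    for n in SAMPLE_SIZES:
        # Random calibration subset of size n, drawn without replacement.
        calibration = rng.sample(localities, n)
        results = [
            (fit_and_score(calibration, features, reg), features, reg)
            for features, reg in product(FEATURE_SETS, REG_MULTIPLIERS)
        ]
        # Keep the setting with the highest evaluation score.
        best[n] = max(results, key=lambda item: item[0])
    return best


def toy_score(calibration, features, reg):
    """Placeholder score, NOT real model output or data from the study."""
    return -abs(reg - 2.0) + 0.1 * len(features)


if __name__ == "__main__":
    fake_localities = list(range(30))  # stand-in for occurrence records
    for n, (score, features, reg) in tune(fake_localities, toy_score).items():
        print(n, features, reg, round(score, 2))
```

In practice `fit_and_score` would wrap whatever Maxent interface is in use, and the default settings would be scored through the same function so that tuned and default models are compared on the same evaluation localities.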
