Journal: Ecological Applications

Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data

Abstract

Most methods for modeling species distributions from occurrence records require additional data representing the range of environmental conditions in the modeled region. These data, called background or pseudo-absence data, are usually drawn at random from the entire region, whereas occurrence collection is often spatially biased toward easily accessed areas. Since the spatial bias generally results in environmental bias, the difference between occurrence collection and background sampling may lead to inaccurate models. To correct the estimation, we propose choosing background data with the same bias as occurrence data. We investigate theoretical and practical implications of this approach. Accurate information about spatial bias is usually lacking, so explicit biased sampling of background sites may not be possible. However, it is likely that an entire target group of species observed by similar methods will share similar bias. We therefore explore the use of all occurrences within a target group as biased background data. We compare model performance using target-group background and randomly sampled background on a comprehensive collection of data for 226 species from diverse regions of the world. We find that target-group background improves average performance for all the modeling methods we consider, with the choice of background data having as large an effect on predictive performance as the choice of modeling method. The performance improvement due to target-group background is greatest when there is strong bias in the target-group presence records. Our approach applies to regression-based modeling methods that have been adapted for use with occurrence data, such as generalized linear or additive models and boosted regression trees, and to Maxent, a probability density estimation method. 
We argue that increased awareness of the implications of spatial bias in surveys, and possible modeling remedies, will substantially improve predictions of species distributions.
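
The contrast between the two kinds of background data described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy transect, the species labels, and the function names `random_background` and `target_group_background` are hypothetical, and collapsing repeated visits to a single background site is an assumption made here for simplicity.

```python
import random


def random_background(region_sites, n, seed=0):
    """Draw n distinct background sites uniformly at random from the whole region."""
    rng = random.Random(seed)
    return rng.sample(region_sites, n)


def target_group_background(records):
    """Use every site with a record of any target-group species as background.

    Repeated visits to a site are collapsed to one entry (an assumption of
    this sketch), so the background inherits the spatial bias of the survey.
    """
    return sorted({site for _species, site in records})


def mean(values):
    return sum(values) / len(values)


# Hypothetical region: sites 0..99 along a transect. The site index doubles as
# distance from a road, so low indices are the easily accessed sites.
region_sites = list(range(100))

# Hypothetical target-group records (species, site), all collected near the road.
records = [
    ("sp_a", 1), ("sp_a", 3), ("sp_a", 8),
    ("sp_b", 2), ("sp_b", 3), ("sp_b", 12),
    ("sp_c", 5), ("sp_c", 9), ("sp_c", 15),
]

tg_bg = target_group_background(records)
rnd_bg = random_background(region_sites, n=len(tg_bg))

print("target-group background sites:", tg_bg)
print("mean distance from road, target-group:", mean(tg_bg))
print("mean distance from road, random:", mean(rnd_bg))
```

In this toy setup the target-group background stays within the surveyed strip near the road, while the random background is drawn from the whole transect. A model fitted with the target-group background therefore contrasts the focal species against conditions that were actually surveyed, rather than against conditions that were never visited.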
