
Statistics in ecological modeling; presence-only data and boosted MARS


Abstract

The research presented in this thesis was primarily motivated by current problems in ecological modeling. One major problem is the so-called presence-only problem, the analysis of which forms the core of this thesis. The remainder of the thesis introduces boosted MARS, a new flexible modeling procedure.

In ecological modeling of the habitat of a species, it can be prohibitively expensive to determine species absence. Presence-only data consist of a sample of locations with known presences and a separate group of locations sampled from the population, with unknown presences. We propose four different approaches, all of which estimate an underlying presence-absence model for presence-only data. In the first, an application of the EM algorithm can be used with almost any off-the-shelf logistic model. In the other three, gradient boosted regression trees are used to maximize different loss functions arising from different model assumptions. Simulations and analyses based on sampling from presence-absence records of fish in New Zealand rivers illustrate that these new procedures can reduce both the deviance and the shrinkage of marginal effect estimates that occur in the naive model often used in practice. It is shown that the population prevalence of a species is identifiable only when there is some unrealistic constraint on the structure of the logistic model. In practice, it is strongly recommended that an estimate of population prevalence be provided.

Boosted MARS is a modeling procedure that aims to mimic the flexibility and performance of gradient boosted trees while providing a smoother fitted model. Refitting the boosted MARS basis functions using L1 regularization sometimes improves predictive performance. However, boosted regression trees usually outperform either boosted MARS model. We also investigate the effects of smoothing and thresholding the MARS basis functions on predictive and descriptive performance.
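The EM idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a single covariate, a plain logistic model fitted by Newton's method, a user-supplied population prevalence, and a case-control intercept correction of log((n_p + prevalence * n_u) / (prevalence * n_u)), where n_p and n_u are the numbers of presence and background locations. The function names `fit_logistic` and `em_presence_only` are hypothetical, chosen only for this illustration.

```python
import math


def sigmoid(t):
    # Clamp the linear predictor so math.exp cannot overflow.
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def fit_logistic(xs, ys, n_iter=15):
    """Logistic regression with one covariate, fitted by Newton's method.

    ys may be fractional (soft labels). Returns (intercept, slope).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient, intercept
            g1 += (y - p) * x      # gradient, slope
            h00 += w               # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H d = g in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def em_presence_only(x_pres, x_back, prevalence, n_em=30):
    """Sketch of EM for presence-only data (assumed form, see lead-in).

    x_pres: covariate values at known presences.
    x_back: covariate values at background locations (presence unknown).
    prevalence: assumed population prevalence, supplied by the user.
    Returns (intercept, slope) on the population (presence-absence) scale.
    """
    n_p, n_u = len(x_pres), len(x_back)
    # Assumed case-control intercept correction.
    offset = math.log((n_p + prevalence * n_u) / (prevalence * n_u))
    xs = list(x_pres) + list(x_back)
    # Start by giving every background point the prevalence as a soft label.
    y_back = [prevalence] * n_u
    b0 = b1 = 0.0
    for _ in range(n_em):
        # M-step: fit the logistic model to presences (label 1) and
        # background points (current soft labels).
        b0_star, b1 = fit_logistic(xs, [1.0] * n_p + y_back)
        b0 = b0_star - offset
        # E-step: update the expected presence at each background point.
        y_back = [sigmoid(b0 + b1 * x) for x in x_back]
    return b0, b1
```

Calling `em_presence_only(x_pres, x_back, prevalence=0.3)` returns an intercept and slope on the presence-absence scale. The prevalence is passed in rather than estimated, in line with the abstract's recommendation that an estimate of population prevalence be provided.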

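Bibliographic details

  • Author

    Ward, Gillian

  • Author affiliation

    Stanford University

  • Degree-granting institution: Stanford University
  • Discipline: Statistics
  • Degree: Ph.D.
  • Year: 2007
  • Pages: 114 p.
  • Total pages: 114
  • Original format: PDF
  • Language of text: English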