Three studies in numerical methods for statistical approximations.

Abstract

Numerical analysis techniques have advanced explosively in recent years, in parallel with the rapid growth and increasing accessibility of computing power. This is not a coincidence: many techniques that were previously of only theoretical interest can now be applied. The best known of these is probably the Metropolis-Hastings algorithm which, although conceived in 1953, has only recently become practical to apply.

Numerical methods are required because it is not always possible to derive explicit probabilistic models and compute their associated estimators analytically. Two major classes of numerical problems arise in statistical inference: optimisation problems and integration problems. Optimisation problems involve determining the 'best' solutions to mathematically defined problems. Integration problems involve obtaining a numerical approximation of an integral when the integral cannot be found explicitly. Optimisation is generally associated with the likelihood approach and integration with the Bayesian approach, although these are not strict classifications. Bootstrap methods, for example, are concerned with the integration of marginal distributions, but are not Bayesian methods.

The statistical techniques that we shall be primarily concerned with are Bayesian methods and the inferences that can be drawn from their use. The approaches we shall focus on are customarily associated with integration problems. In two of the three parts of this thesis we shall focus on the fact that continuous statistical models are only ever approximations to measurement processes that are necessarily discrete. Numerical integration procedures provide almost unlimited scope for realistic statistical modelling. Until recently, acknowledging the full complexity and structure of many statistical problems was difficult, and often required the development of specific methodology and purpose-built software.
The alternative was to formulate the problem in the often over-simple framework of an available method. Modern integration techniques provide a unifying framework within which many complex problems can be analysed using standard computer programs. Recent numerical developments have brought together researchers in all branches of applied statistics. Because traditional methods of analysis are not readily adaptable to all settings, researchers in individual disciplines have often developed original model-fitting approaches customised for their own problems. For example, the Metropolis-Hastings algorithm originated in statistical mechanics.

This Thesis is split into three main Chapters, each concerned with some branch of numerical approximation. Chapter 2 considers how to calculate the probability that the sum of products of variables, each assessed with a Normal distribution, is negative. The analysis was motivated by a specific problem in electrical engineering. Resolving the problem requires two distinct steps. First, we consider ways in which we can assess the distribution of the product of two Normally distributed variables. Three different methods are compared: a numerical approximation, which involves implementing a numerical integration procedure in MATLAB; a Monte Carlo construction; and an approximation to the analytic result using the Normal distribution. The second step considers how to assess the distribution of the sum of the products of two Normally distributed variables by applying the Convolution Formula. To conclude Chapter 2, the two steps are combined to compute the distribution of the sum of products of Normally distributed variables, and thus to calculate the probability that this sum of products is negative. The problem is also approached directly, using a Monte Carlo approximation.
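
The direct Monte Carlo approach mentioned for Chapter 2 can be illustrated with a minimal sketch. It is not the thesis's own code: the function name, the parameter values and, in particular, the assumption that all variables are mutually independent are illustrative choices made here.

```python
import numpy as np

def prob_sum_of_products_negative(mu_x, sigma_x, mu_y, sigma_y,
                                  n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(sum_i X_i * Y_i < 0).

    Assumes (for illustration only) that every X_i and Y_i is an
    independent Normal variable with the given mean and standard deviation.
    Returns the estimate and its binomial standard error.
    """
    rng = np.random.default_rng(seed)
    mu_x, sigma_x = np.asarray(mu_x, float), np.asarray(sigma_x, float)
    mu_y, sigma_y = np.asarray(mu_y, float), np.asarray(sigma_y, float)

    # Each row is one simulated set (X_1..X_k) and (Y_1..Y_k).
    x = rng.normal(mu_x, sigma_x, size=(n_draws, mu_x.size))
    y = rng.normal(mu_y, sigma_y, size=(n_draws, mu_y.size))

    # Sum of products for every simulated set.
    s = (x * y).sum(axis=1)

    # Proportion of simulated sums that are negative.
    p_hat = float(np.mean(s < 0.0))
    std_err = float(np.sqrt(p_hat * (1.0 - p_hat) / n_draws))
    return p_hat, std_err

if __name__ == "__main__":
    # Hypothetical inputs: three products with arbitrary means and spreads.
    p, se = prob_sum_of_products_negative(
        mu_x=[1.0, 0.5, -0.2], sigma_x=[1.0, 1.0, 0.5],
        mu_y=[0.8, -0.3, 1.0], sigma_y=[0.5, 1.0, 1.0],
    )
    print(f"estimated probability: {p:.4f} (standard error {se:.4f})")
```

The standard error shrinks with the square root of the number of draws, so an estimate of this kind offers a cheap cross-check on a result obtained by numerical integration and convolution.
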
Chapter 3 investigates how well continuous conjugate theory can approximate real discrete mass functions in various measurement settings. All statistical measurements which represent the values of useful unknown quantities have a realm that is both finite and discrete. Thus our uncertainties about any measurement can be represented by discrete probability mass functions. Nonetheless, common statistical practice treats probability distributions as representable by continuous densities or mixture densities. Many statistical problems involve the analysis of sequences of observations that the researcher regards exchangeably. Often we wish to find a joint probability mass function over X_1, X_2, ..., X_n, with interim interest in the sequence of updated probability mass functions f(x_{i+1} | X_i = x_i) for i = 1, 2, ..., n - 1. We examine how well digital Normal mass functions and digital parametric mixtures are approximated by continuous Mixture Normal and Normal-Gamma Mixture Normal distributions for such items as E(X_{i+1} | X_i = x_i) and V(X_{i+1} | X_i = x_i). Digital mass functions are generated by specifying a finite realm of measurements for a quantity of interest, finding the density value of some specified function at each point, and then normalising the densities over the realm to generate mass values. Both a digitised prior mixing mass function and a digitised information transfer function are generated and used, via Bayes' Theorem, to compute posterior mass functions. Approximating posterior densities are evaluated using continuous conjugate theory, and the two sets of results are compared. The main achievement of this Chapter is to formalise a computing strategy that can be applied to many functional forms. An example is provided in the next Chapter.

In Chapter 4 different approaches to flood frequency analysis are considered, with particular emphasis on estimating extreme hydrological events for a site, or group of sites.
Flood risk has been the topic of a considerable number of publications over the last twenty years, yet there is still no consensus on how best to proceed. The problem is complicated by the need to estimate flood risk for return periods that exceed the length of observed record. Consequently much research has focused on methods emphasising data pooling. Chapter 4 begins with an examination of different frequentist approaches to flood estimation. We study at-site and regional estimates, and compare their accuracy and precision. Next, we assess flood exceedance quantiles using updated mixture mass functions as sequential forecasting distributions. These sequential forecasts are scored using three different scoring rules for distributions: the quadratic, logarithmic and spherical. The digital updating procedure is based on the work developed in Chapter 3. Both the frequentist methods and the digital forecasting procedures are applied to data collected from the Waimakariri River in Canterbury, New Zealand. We complete the Chapter by comparing the appropriateness of the frequentist and digital methods. It is found that the mixture distributions computed via the discrete digital method provide much more uniform forecasts across an array of proposed distribution families than do the frequentist forecasting methods.

Before proceeding to the main body of work, we shall briefly introduce three different categories of numerical integration algorithms: non-Monte Carlo methods, non-iterative Monte Carlo methods and iterative Monte Carlo methods. Finally, the application of different methods of numerical approximation to the work contained in this Thesis will be discussed. Numerical integration algorithms approximate the generation of random variables from a posterior distribution when this distribution cannot be directly computed. Non-Monte Carlo methods of numerical integration consist of algorithms based on Simpson's method.
They do not require the input of a stream of (pseudo) random numbers. Whereas algorithms based on Simpson's method evaluate a function at a sequence of equally spaced points, Monte Carlo methods are forms of numerical integration based on repeated simulation. Non-iterative Monte Carlo methods, also known as traditional Monte Carlo methods, are algorithms that require a stream of (pseudo) random numbers as input and produce a sample from the posterior density as output. Examples include importance sampling and accept-reject methods. Iterative Monte Carlo methods, or Markov chain Monte Carlo (MCMC) methods, are algorithms that require a random input stream and also require iteration to realise a sample from the posterior distribution of interest. Examples include the Gibbs sampling algorithm and the Metropolis-Hastings (M-H) algorithm. Before we consider the differences between these three categories, we shall briefly introduce the Bayesian paradigm, illustrating the vital role of integration.
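
To make the contrast between a non-Monte Carlo method and an iterative Monte Carlo method concrete, the sketch below computes one posterior mean in two ways: with composite Simpson's rule on equally spaced points, and with a random-walk Metropolis sampler (the symmetric-proposal special case of Metropolis-Hastings). The target, an unnormalised Beta(4, 8) kernel whose exact mean is 1/3, and all function names and tuning values are assumptions chosen for illustration; none of them comes from the thesis.

```python
import numpy as np

def unnormalised_posterior(theta):
    """Illustrative target: theta^3 * (1 - theta)^7 on (0, 1), zero elsewhere."""
    theta = np.asarray(theta, dtype=float)
    inside = (theta > 0.0) & (theta < 1.0)
    safe = np.where(inside, theta, 0.5)  # placeholder value outside the support
    return np.where(inside, safe**3 * (1.0 - safe)**7, 0.0)

def simpson(values, h):
    """Composite Simpson's rule for an odd number of equally spaced values."""
    if values.size % 2 == 0:
        raise ValueError("Simpson's rule needs an even number of intervals")
    return h / 3.0 * (values[0] + values[-1]
                      + 4.0 * values[1:-1:2].sum()
                      + 2.0 * values[2:-1:2].sum())

def posterior_mean_simpson(n_intervals=200):
    """Non-Monte Carlo: ratio of two Simpson integrals, no random numbers."""
    grid = np.linspace(0.0, 1.0, n_intervals + 1)
    h = 1.0 / n_intervals
    g = unnormalised_posterior(grid)
    return float(simpson(grid * g, h) / simpson(g, h))

def posterior_mean_metropolis(n_iter=50_000, burn_in=5_000, step=0.2, seed=0):
    """Iterative Monte Carlo: random-walk Metropolis with a Normal proposal."""
    rng = np.random.default_rng(seed)
    theta = 0.5
    g_theta = float(unnormalised_posterior(theta))
    draws = np.empty(n_iter)
    for t in range(n_iter):
        proposal = theta + step * rng.standard_normal()
        g_prop = float(unnormalised_posterior(proposal))
        # Accept with probability min(1, g_prop / g_theta).
        if rng.random() * g_theta < g_prop:
            theta, g_theta = proposal, g_prop
        draws[t] = theta
    # Discard the burn-in iterations before averaging.
    return float(draws[burn_in:].mean())

if __name__ == "__main__":
    print("Simpson's rule:", posterior_mean_simpson())
    print("Metropolis:    ", posterior_mean_metropolis())
```

Both routines need only the unnormalised density. Simpson's rule is deterministic and very accurate in one dimension, whereas the sampler's answer varies with the random stream and the number of iterations but extends more readily to higher-dimensional problems.
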

Bibliographic details

  • Author

    Ware Robert Stuart

  • Author affiliation
  • Year 2003
  • Total pages
  • Original format PDF
  • Language en
  • CLC classification