Investigation of topics in U-statistics and their applications in risk estimation and cross-validation

Abstract

The primary goal of my dissertation has been to develop new methods, including theory and practical implementation, in the area of U-statistics. This area is quite old, with many important results first appearing in Hoeffding (1948). U-statistics have many applications in nonparametric statistics. One modern and active area is cross-validation and risk estimation, although it has not traditionally been thought of as a U-statistic area. The application of my research has focused on this area.

The first objective of my research is to devise the best unbiased variance estimator for a general U-statistic. The estimator can be written as a quadratic form of the kernel function and is applicable as long as the kernel size k satisfies k ≤ n/2, where n is the sample size. In addition, it can be represented in a familiar ANOVA form, as a contrast of between-class and within-class variation. To make the proposed variance estimator more practical, we developed a partition resampling scheme that realizes the U-statistic and its variance estimator simultaneously with high computational efficiency.

We then turn our attention to the use of U-statistics in risk estimation for the nonparametric kernel density estimator. We propose U-statistic form estimates of the risks that arise from the L2 and Kullback-Leibler distances, respectively. In addition, we consider a two-stage "subsampling + extrapolation" bandwidth selection procedure that can dramatically reduce the variability of the conventional cross-validation bandwidth selector. It is equivalent to Hall and Robinson's (2009) [27] rescaled "bagging cross-validation" bandwidth selector if one sets the fictional sample size equal to the bootstrap size. However, the simple form of our U-statistic risk estimator enables us to calculate the aggregated risk much more efficiently than bootstrapping.

Moreover, a real data example in the context of model selection is considered. We construct a U-statistic cross-validation tool, akin to the BIC criterion for model selection. The U-estimator for the likelihood risk is more generally applicable than the AIC and BIC methods. In addition, with our proposed variance estimator for a general U-statistic, we can test which model has the smallest risk.

Finally, we will explore extrapolation and interpolation techniques with applications in bandwidth selection, variance estimation, and quantile estimation. Some preliminary results will be discussed at the end of the dissertation.
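The variance-estimation idea in the abstract can be illustrated with a small brute-force sketch. It assumes i.i.d. data and a symmetric kernel, and relies on one standard fact: kernel values computed on two disjoint subsets are independent, so their product has expectation θ² and U² minus the average of such products is unbiased for Var(U). This is why disjoint pairs, and hence k ≤ n/2, are needed. The function name `u_statistic_with_variance` is hypothetical, and whether this form coincides with the dissertation's estimator is an assumption. The sketch enumerates all subsets, so it is practical only for small n and is not the partition resampling scheme.

```python
from itertools import combinations
from math import isclose
from statistics import mean, variance


def u_statistic_with_variance(x, kernel, k):
    """Return (U, V) for a symmetric kernel of size k.

    U is the complete U-statistic; V = U^2 - Q0, where Q0 averages
    kernel(S1) * kernel(S2) over all ordered pairs of disjoint subsets.
    Hypothetical helper for illustration only; cost grows like C(n, k)^2.
    """
    n = len(x)
    if 2 * k > n:
        raise ValueError("disjoint subsets require k <= n/2")
    subsets = list(combinations(range(n), k))
    values = [kernel(*(x[i] for i in s)) for s in subsets]
    u = mean(values)
    # Products over disjoint subset pairs estimate theta^2 without bias.
    products = [
        values[a] * values[b]
        for a, s1 in enumerate(subsets)
        for b, s2 in enumerate(subsets)
        if set(s1).isdisjoint(s2)
    ]
    q0 = mean(products)
    return u, u * u - q0


x = [1.0, 2.0, 4.0, 7.0, 11.0]

# Kernel of size 1: U is the sample mean, and V reduces to s^2 / n.
u_mean, v_mean = u_statistic_with_variance(x, lambda a: a, 1)

# Kernel of size 2: U is the unbiased sample variance.
u_var, v_var = u_statistic_with_variance(x, lambda a, b: 0.5 * (a - b) ** 2, 2)
```

For the size-1 kernel the estimate collapses algebraically to the familiar s²/n, which gives a quick sanity check on the construction. Like other unbiased variance estimators of this type, the value returned for larger kernels is not guaranteed to be nonnegative on every sample.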

Bibliographic record

Author: Wang, Qing
Author affiliation: The Pennsylvania State University
Degree-granting institution: The Pennsylvania State University
Subjects: Statistics; Applied mathematics
Degree: Ph.D.
Year: 2012
Pages: 188 p.
Total pages: 188
Original format: PDF
Language of text: English
