Journal: Sequential Analysis

Two-Stage Procedures for High-Dimensional Data

Abstract

In this article, we consider a variety of inference problems for high-dimensional data. The purpose of this article is to suggest directions for future research and possible solutions to p > n problems by using new types of two-stage estimation methodologies. This is the first attempt to apply sequential analysis to high-dimensional statistical inference while ensuring prespecified accuracy. We offer sample size determination for inference problems by creating new types of multivariate two-stage procedures. To develop the theory and methodologies, the most important and basic idea is asymptotic normality when p→∞. By developing asymptotic normality when p→∞, we first give (a) a given-bandwidth confidence region for the square loss. In addition, we give (b) a two-sample test that assures prespecified size and power simultaneously, together with (c) an equality-test procedure for two covariance matrices. We also give (d) a two-stage discriminant procedure that controls misclassification rates to be no more than a prespecified value. Moreover, we propose (e) a two-stage variable selection procedure that screens variables in the first stage and selects a significant set of associated variables from among a set of candidate variables in the second stage. Following the variable selection procedure, we consider (f) variable selection for high-dimensional regression to compare favorably with the lasso in terms of the assurance of accuracy and the computational cost. Further, we consider variable selection for classification and propose (g) a two-stage discriminant procedure after screening some variables. Finally, we consider (h) pathway analysis for high-dimensional data by constructing a multiple test of correlation coefficients.
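
To make the two-stage idea of sample size determination concrete, here is a minimal sketch of the classical Stein-type two-stage procedure for a fixed-width confidence interval for a single normal mean. It is a univariate illustration only and is not the article's high-dimensional methodology. The function names, the `draw` sampler, and the simulated data are hypothetical. The critical value 2.262 is the upper 2.5% point of the t distribution with 9 degrees of freedom, matching a pilot sample of size 10.

```python
import math
import random
import statistics


def two_stage_sample_size(pilot, half_width, crit):
    """Total sample size N = max(m, floor(crit^2 * S_m^2 / d^2) + 1)."""
    m = len(pilot)
    s2 = statistics.variance(pilot)  # unbiased pilot variance S_m^2
    return max(m, math.floor(crit ** 2 * s2 / half_width ** 2) + 1)


def two_stage_interval(draw, m, half_width, crit):
    """Run both stages; `draw(k)` is a hypothetical sampler returning k observations."""
    pilot = draw(m)                                   # stage 1: pilot sample of size m
    n_total = two_stage_sample_size(pilot, half_width, crit)
    data = pilot + draw(n_total - m)                  # stage 2: top up to N observations
    center = statistics.fmean(data)
    return n_total, (center - half_width, center + half_width)


# Simulated example (assumed data): normal observations with mean 5 and sd 2.
rng = random.Random(0)
draw = lambda k: [rng.gauss(5.0, 2.0) for _ in range(k)]
n_total, (lo, hi) = two_stage_interval(draw, m=10, half_width=0.5, crit=2.262)
print(n_total, round(lo, 3), round(hi, 3))
```

The interval always has the prespecified half-width; only the total sample size is random, because it depends on the pilot variance. That is the sense in which a two-stage procedure trades a data-dependent sample size for guaranteed accuracy.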