Significance analysis of high-dimensional, low-sample size partially labeled data

Qiyi Lu; Xingye Qiao

Abstract

Purpose: To propose a significance analysis method for classification/clustering of high-dimensional, low-sample size data in which only a small portion of the class labels is available (partially labeled data).

Summary: Classification and clustering play a central role in statistical learning. Testing the difference between two classes is challenging for high-dimensional, low-sample size (HDLSS) data. Approaches exist for such data, but the problem becomes harder when many observations lack class labels (partially labeled data). The article develops a significance testing method for HDLSS partially labeled data. Two existing significance analysis methods are considered: the DiProPerm test, which is applicable when all class labels are known, and the statistical significance of clustering test (SigClust), which does not require labels. These test methods are reviewed in detail from the perspective of their application to HDLSS data, and the proposed test for the significance analysis of HDLSS partially labeled data (SigPal) is presented. Some theoretical results are studied, with an emphasis on the HDLSS setting. A comprehensive simulation study illustrates the proposed test. A real-data application to breast cancer data is also studied to demonstrate the usefulness of the proposed method, and the results are discussed. (41 refs.)

Results: While classification and clustering are important tools in statistical learning, their successful application depends on the nature of the data at hand. Generally, in classification, class labels are provided prior to the analysis, whereas such labels are unavailable in clustering. The task becomes more challenging when the data are high-dimensional with low sample size (HDLSS).
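
To make the fully labeled case concrete, the following is a minimal sketch of a DiProPerm-style (direction, projection, permutation) test. It is not the authors' SigPal procedure and not code from the article. It assumes the mean-difference direction and the difference of projected class means as the test statistic, which is one common choice; the function name `diproperm_mean_diff` and its parameters are hypothetical.

```python
import numpy as np


def diproperm_mean_diff(X, y, n_perm=500, seed=0):
    """DiProPerm-style permutation test (illustrative sketch only).

    X: (n, d) data matrix; y: length-n array of 0/1 class labels.
    Assumption: mean-difference direction and mean-difference statistic.
    Returns the observed statistic and a permutation p-value.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    def statistic(labels):
        a, b = X[labels == 1], X[labels == 0]
        # Direction: unit vector along the difference of class means.
        direction = a.mean(axis=0) - b.mean(axis=0)
        norm = np.linalg.norm(direction)
        if norm == 0:
            return 0.0
        direction /= norm
        # Projection: compare the class means of the 1-D projected scores.
        return float((a @ direction).mean() - (b @ direction).mean())

    observed = statistic(y)
    # Permutation: relabel at random, then recompute direction and statistic.
    permuted = np.array([statistic(rng.permutation(y)) for _ in range(n_perm)])
    p_value = (1 + np.sum(permuted >= observed)) / (n_perm + 1)
    return observed, p_value
```

The direction is re-estimated inside every permutation, so the null distribution accounts for the apparent separation that a direction fitted to HDLSS data shows even when the two classes do not differ.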


Bibliographic details

Source
Quality Control and Applied Statistics | 2018, Issue 4 | 2 pages
Authors
Qiyi Lu; Xingye Qiao
Author affiliations

Department of Mathematical Sciences, Binghamton University, State University of New York;

Department of Mathematical Sciences, Binghamton University, State University of New York

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Applications of probability theory and mathematical statistics
