Journal: Statistical Analysis and Data Mining

Sparse linear discriminant analysis in structured covariates space

Abstract

Classification with high-dimensional variables is a popular goal in many modern statistical studies. Fisher's linear discriminant analysis (LDA) is a common and effective tool for classifying entities into existing groups. It is well known that, for high-dimensional data, classification using Fisher's discriminant can be as bad as random guessing, because the many noise features it uses increase the misclassification rate. It is also increasingly acknowledged that complex biological mechanisms arise from multiple features working together, even though individually these features may contribute to noise accumulation in the data. In view of this, it is important to perform classification with discriminant vectors that use a subset of important variables while also exploiting prior biological relationships among features. We tackle this problem in this paper and propose methods that incorporate variable selection into the classification problem in order to identify important biomarkers. Furthermore, we incorporate prior information on the relationships among variables into the LDA problem, using undirected graphs, in order to identify functionally meaningful biomarkers. We compare our methods with existing sparse LDA approaches via simulation studies and real data analysis.
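To make the ingredients named in the abstract concrete, the following is a minimal two-class sketch, not the method proposed in the paper. It assumes a ridge-stabilized Fisher direction, a graph Laplacian penalty built from an undirected edge list as a stand-in for "prior information on the relationships among variables", and a simple soft-thresholding step as a stand-in for variable selection. All function names and tuning parameters (`lam_graph`, `lam_sparse`, `ridge`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    # Shrink each entry toward zero by t; entries with |v| <= t become exactly 0.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def graph_laplacian(p, edges):
    # Unweighted Laplacian L = D - A of an undirected graph on p variables.
    L = np.zeros((p, p))
    for i, j in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    return L

def sparse_graph_lda_direction(X, y, edges, lam_graph=1.0, lam_sparse=0.1, ridge=1e-3):
    # Illustrative assumption: solve (S_w + lam_graph * L + ridge * I) w = mean difference,
    # then soft-threshold w relative to its largest absolute entry.
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Xc = np.vstack([X0 - m0, X1 - m1])
    S_w = Xc.T @ Xc / (len(X) - 2)  # pooled within-class covariance
    p = X.shape[1]
    A = S_w + lam_graph * graph_laplacian(p, edges) + ridge * np.eye(p)
    w = np.linalg.solve(A, m1 - m0)
    w = soft_threshold(w, lam_sparse * np.max(np.abs(w)))
    return w, (m0 + m1) / 2.0

def predict(X, w, midpoint):
    # Assign class 1 when the projection lies on the class-1 side of the midpoint.
    return ((X - midpoint) @ w > 0).astype(int)
```

The Laplacian term penalizes differences between coefficients of variables joined by an edge, so connected variables tend to receive similar weights, while the thresholding step sets small coefficients to exactly zero. Both choices are generic devices and should not be read as the estimators studied in the paper.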
