Journal of Chemometrics

DISCRIMINANT ANALYSIS WITH SINGULAR COVARIANCE MATRICES. A METHOD INCORPORATING CROSS-VALIDATION AND EFFICIENT RANDOMIZED PERMUTATION TESTS

Abstract

A computationally efficient approach has been developed to perform two-group linear discriminant analysis using high-dimensional data. The analysis is based on Fisher's method and incorporates two important validation stages: (1) full leave-one-observation-out cross-validation and (2) randomized permutation distribution testing. The resulting algorithm and software are known as CREDIT (cross-validated random-permutation-tested efficient discrimination based on an adjusted generalized inverse for the sample total covariance matrix). The algorithm has been implemented in the SAS/IML matrix programming language and provides dramatic improvements in computational efficiency compared with existing software for discriminant analysis incorporating validation stages 1 and 2 above. Application of CREDIT to nine multivariate data sets indicates that the predictive performance of the approach, assessed using cross-validation, is comparable with that of other methods for discriminant analysis. Comparisons with two specific methods are included. Randomized permutation tests show that success rates using the true response classes are almost always better than success rates using random permutations of the classes. This gives confidence that there is a useful linear discriminant relationship present in the data being analysed. For a randomly selected training set (used to construct the discriminant rule), the success rates for CREDIT are unbiased predictive success rates for allocating other observations to groups. Predicting group memberships for future observations using any discriminant model based on singular estimates of covariance matrices must be performed with great care. A discussion of methods to test the concordance of future observations with the training set is given.
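
The workflow in the abstract (a Fisher-type rule built on a generalized inverse of the sample total covariance matrix, then leave-one-observation-out cross-validation, then a randomized permutation test) can be illustrated with a minimal sketch. This is not the CREDIT algorithm itself. The abstract does not specify the "adjusted" generalized inverse, so the sketch assumes a plain Moore-Penrose pseudoinverse, and it refits the rule by brute force rather than using the efficient computations that CREDIT is reported to provide. It is written in Python with NumPy rather than SAS/IML, and the function names (`fit_fisher`, `loo_success_rate`, `permutation_test`) and the synthetic data are hypothetical.

```python
import numpy as np


def fit_fisher(X, y):
    """Two-group Fisher-type rule using a pseudoinverse of the total covariance.

    Assumption: Moore-Penrose pseudoinverse stands in for the paper's
    "adjusted generalized inverse", which the abstract does not define.
    """
    m0 = X[y == 0].mean(axis=0)
    m1 = X[y == 1].mean(axis=0)
    Xc = X - X.mean(axis=0)
    T = Xc.T @ Xc / (len(X) - 1)      # singular when variables >= observations
    w = np.linalg.pinv(T) @ (m1 - m0)  # discriminant direction
    cut = w @ (m0 + m1) / 2.0          # midpoint of the projected group means
    return w, cut


def loo_success_rate(X, y):
    """Full leave-one-observation-out cross-validated success rate (brute force)."""
    n = len(y)
    hits = 0
    for i in range(n):
        keep = np.arange(n) != i       # drop observation i, refit, then predict it
        w, cut = fit_fisher(X[keep], y[keep])
        hits += int((X[i] @ w > cut) == bool(y[i]))
    return hits / n


def permutation_test(X, y, n_perm=100, seed=0):
    """Compare the true-label success rate with rates under permuted labels."""
    rng = np.random.default_rng(seed)
    observed = loo_success_rate(X, y)
    perm = np.array(
        [loo_success_rate(X, rng.permutation(y)) for _ in range(n_perm)]
    )
    # Proportion of permutations doing at least as well as the true labels
    p_value = (1 + np.sum(perm >= observed)) / (n_perm + 1)
    return observed, perm, p_value


# Hypothetical synthetic data: 20 observations, 40 variables (singular covariance)
rng = np.random.default_rng(1)
n_per_group, p = 10, 40
y = np.repeat([0, 1], n_per_group)
X = rng.normal(size=(2 * n_per_group, p))
X[y == 1, :5] += 3.0                   # group difference in the first 5 variables

observed, perm, p_value = permutation_test(X, y, n_perm=100)
print(f"LOO success rate: {observed:.2f}, permutation p-value: {p_value:.3f}")
```

Each permutation repeats the whole cross-validation, so this brute-force version performs `n_perm * n` refits. That cost is the kind of burden the abstract's claim of computational efficiency addresses.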