Journal: Technometrics

MacroPCA: An All-in-One PCA Method Allowing for Missing Values as Well as Cellwise and Rowwise Outliers

Abstract

Multivariate data are typically represented by a rectangular matrix (table) in which the rows are the objects (cases) and the columns are the variables (measurements). When there are many variables one often reduces the dimension by principal component analysis (PCA), which in its basic form is not robust to outliers. Much research has focused on handling rowwise outliers, that is, rows that deviate from the majority of the rows in the data (e.g., they might belong to a different population). In recent years also cellwise outliers are receiving attention. These are suspicious cells (entries) that can occur anywhere in the table. Even a relatively small proportion of outlying cells can contaminate over half the rows, which causes rowwise robust methods to break down. In this article, a new PCA method is constructed which combines the strengths of two existing robust methods to be robust against both cellwise and rowwise outliers. At the same time, the algorithm can cope with missing values. As of yet it is the only PCA method that can deal with all three problems simultaneously. Its name MacroPCA stands for PCA allowing for Missingness And Cellwise & Rowwise Outliers. Several simulations and real datasets illustrate its robustness. New residual maps are introduced, which help to determine which variables are responsible for the outlying behavior. The method is well-suited for online process control.
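
The claim that a small proportion of outlying cells can contaminate over half the rows can be checked with a short calculation. The sketch below is illustrative and not taken from the paper: it assumes each cell is contaminated independently with the same probability, so a row with d columns contains at least one outlying cell with probability 1 - (1 - eps)^d. The function names `contaminated_row_fraction` and `min_columns_for_majority` are hypothetical helpers introduced here.

```python
import math


def contaminated_row_fraction(cell_rate: float, n_columns: int) -> float:
    """Expected fraction of rows holding at least one outlying cell.

    Assumption (ours, not the paper's): cells are contaminated
    independently, each with probability `cell_rate`.
    """
    if not 0.0 < cell_rate < 1.0:
        raise ValueError("cell_rate must lie strictly between 0 and 1")
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    # A row is clean only if all of its cells are clean.
    return 1.0 - (1.0 - cell_rate) ** n_columns


def min_columns_for_majority(cell_rate: float) -> int:
    """Smallest number of columns at which more than half of the rows
    are expected to contain an outlying cell (same independence assumption)."""
    if not 0.0 < cell_rate < 1.0:
        raise ValueError("cell_rate must lie strictly between 0 and 1")
    # Solve (1 - cell_rate)^d < 0.5 for the smallest integer d.
    threshold = math.log(0.5) / math.log(1.0 - cell_rate)
    return math.floor(threshold) + 1


if __name__ == "__main__":
    # With 5% outlying cells, 14 columns already push the expected
    # share of contaminated rows past one half (about 0.51).
    print(contaminated_row_fraction(0.05, 14))
    print(min_columns_for_majority(0.05))
```

Under this assumption, 5% contaminated cells are enough to affect a majority of rows once the table has 14 or more variables, which is the situation in which purely rowwise robust methods are expected to break down.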
