Optimal algorithms for L1-norm Principal Component Analysis: New tools for signal processing and machine learning with few and/or faulty training data.

Abstract

The objective of this work is the development of a solid theoretical and algorithmic framework for outlier-resistant L1-norm Principal Component Analysis (PCA). PCA is the statistical data analysis technique that has been, for over a century, the "mainstay" of modern signal processing and machine learning, with numerous important applications in wireless communications, computer networks, computer vision, image processing, and bio-informatics, to name a few. However, researchers have long observed that standard L2-norm-based PCA is highly sensitive to corrupted, highly deviating, irregular data points (outliers), even when they appear in vanishingly small numbers. Because such outliers appear frequently in real-world applications and L2-norm Principal Components (PCs) are sensitive to them, a great amount of research effort has been devoted in the past few decades to calculating and using, instead, PCs that define L1-norm maximum-projection data subspaces (L1-PCA).

A summary of our contributions in this manuscript follows. In Chapter 1, we translate L1-PCA into combinatorial optimization and deliver the first two optimal algorithms in the literature for its exact calculation. In Chapter 2, we propose a third, efficient L1-PCA algorithm (complexity close to that of standard L2-PCA) that attains optimal performance with empirical probability close to 1, outperforming on average all counterparts of comparable computational cost existing in the literature. This algorithm was designed to bridge the gap between our optimal algorithms of high computational cost and the existing low-cost suboptimal algorithms of high performance degradation and, thus, to be the method of choice for real-world L1-PCA of large data. In Chapter 3, we focus on the special case of real, non-negative matrices (e.g., images, graph adjacency matrices) and calculate their optimal L1-PC with linear cost. Then, we present a novel L1-PCA-based technique for the recovery of an image from a set of few, possibly severely corrupted copies. In Chapter 4, we employ our L1-PCA tools from Chapters 1 and 2 to develop a state-of-the-art subspace-based direction-of-arrival (DoA) estimation method that attains performance similar to that of the highly popular L2-subspace methods in nominal system operation, while exhibiting inherent resistance against unknown data record contamination. In Chapter 5, we steer our focus from real to complex data analysis and define, for the first time, the L1-PC of complex data. Then, we present an algorithm for calculating the L1-PC of a complex data matrix and use it to devise a novel outlier-resistant subspace-based DoA estimation method. Finally, in Chapter 6, we establish the MMSE operation for pseudonoise (PN) masked data in the form of a time-varying linear filter, suggest an implementation that avoids repeated input autocorrelation matrix inversion, and develop an auxiliary-vector (AV) MMSE filter estimator with state-of-the-art short-data-record estimation performance.
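The abstract does not spell out the combinatorial form of L1-PCA, so the following is only an illustrative sketch under stated assumptions, not the thesis's algorithms. It assumes the commonly cited identity that, for a real D x N data matrix X, maximizing ||X^T q||_1 over unit vectors q is equivalent to maximizing ||X b||_2 over sign vectors b in {+1, -1}^N, with q = X b / ||X b||_2. The function names `l1_pc_exhaustive` and `l1_pc_nonnegative` are hypothetical. The first uses brute-force search (exponential in N, so usable only for tiny N); the second uses the observation that, when every entry of X is non-negative, the all-ones sign vector is optimal, which gives a cost linear in the size of X.

```python
import itertools
import numpy as np


def l1_pc_exhaustive(X):
    """Exact L1 principal component of a nonzero real D x N matrix.

    Brute force over sign vectors b in {+1, -1}^N (an assumed
    reformulation, see the lead-in); cost grows as 2^(N-1).
    """
    _, N = X.shape
    best_val, best_b = -1.0, None
    # b and -b give the same objective, so the first sign is fixed to +1.
    for tail in itertools.product((1.0, -1.0), repeat=N - 1):
        b = np.array((1.0,) + tail)
        val = np.linalg.norm(X @ b)
        if val > best_val:
            best_val, best_b = val, b
    q = X @ best_b / best_val  # unit-norm L1 principal component
    return q, best_b


def l1_pc_nonnegative(X):
    """L1 principal component of a nonzero matrix with non-negative entries.

    With all entries >= 0 the all-ones sign vector maximizes ||X b||_2,
    so the component is the normalized row-sum vector.
    """
    s = X.sum(axis=1)
    return s / np.linalg.norm(s)
```

For a small matrix such as `X = np.random.default_rng(0).standard_normal((3, 6))`, calling `q, b = l1_pc_exhaustive(X)` returns a unit vector `q` for which `np.abs(X.T @ q).sum()` equals `np.linalg.norm(X @ b)` up to rounding.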

Bibliographic record

  • Author

    Markopoulos, Panagiotis

  • Author affiliation

    State University of New York at Buffalo

  • Degree-granting institution State University of New York at Buffalo
  • Subject Electrical engineering
  • Degree Ph.D.
  • Year 2015
  • Pages 163 p.
  • Total pages 163
  • Original format PDF
  • Language of text eng
