Conference: Uncertainty in Artificial Intelligence

Sufficient Dimensionality Reduction with Irrelevance Statistics


Abstract

The problem of unsupervised dimensionality reduction of stochastic variables while preserving their most relevant characteristics is fundamental for the analysis of complex data. Unfortunately, this problem is ill-defined, since natural datasets inherently contain alternative underlying structures. In this paper we address this problem by extending the recently introduced "Sufficient Dimensionality Reduction" feature extraction method [7] to use "side information" about irrelevant structures in the data. The use of such irrelevance information was recently demonstrated successfully in the context of clustering via the Information Bottleneck method [1]. Here we use this side-information framework to identify continuous features whose measurements are maximally informative for the main dataset but carry as little information as possible about the irrelevance dataset. In statistical terms, this can be understood as extracting statistics that are maximally sufficient for the main dataset while simultaneously maximally ancillary for the irrelevance dataset. We formulate this as a tradeoff optimization problem and describe its analytic and algorithmic solutions. The method is demonstrated on a synthetic example and on a real-world application to face images, where it outperforms other methods such as Oriented Principal Component Analysis.
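
The tradeoff described in the abstract can be written schematically as follows. This is only an illustrative sketch: the symbols $X$, $Y^{+}$, $Y^{-}$, $\phi$ and $\gamma$ are assumed notation that does not appear in the abstract, and the paper's exact information measure may differ from the plain mutual information used here.

```latex
% Assumed notation (not from the abstract):
%   X       : the variable whose features are extracted
%   Y^{+}   : the variable paired with X in the main dataset
%   Y^{-}   : the variable paired with X in the irrelevance dataset
%   \phi    : the continuous feature map being sought
%   \gamma  : a non-negative tradeoff weight
\[
  \max_{\phi}\; \mathcal{L}(\phi)
  \;=\; I\bigl(\phi(X);\, Y^{+}\bigr)
  \;-\; \gamma\, I\bigl(\phi(X);\, Y^{-}\bigr),
  \qquad \gamma \ge 0 .
\]
```

In this form, setting $\gamma = 0$ reduces the objective to the main-dataset term alone, while larger values of $\gamma$ penalize features that remain informative about the irrelevance dataset more heavily.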
