
Adaptive Metric Dimensionality Reduction


Abstract

We study data-adaptive dimensionality reduction in the context of supervised learning in general metric spaces. Our main statistical contribution is a generalization bound for Lipschitz functions in metric spaces that are doubling, or nearly doubling, which yields a new theoretical explanation for empirically reported improvements gained by preprocessing Euclidean data by PCA (Principal Components Analysis) prior to constructing a linear classifier. On the algorithmic front, we describe an analogue of PCA for metric spaces, namely an efficient procedure that approximates the data's intrinsic dimension, which is often much lower than the ambient dimension. Our approach thus leverages the dual benefits of low dimensionality: (1) more efficient algorithms, e.g., for proximity search, and (2) more optimistic generalization bounds.
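The paper's own procedure for approximating intrinsic dimension in general metric spaces is not reproduced here, but the Euclidean special case it generalizes — PCA preprocessing before fitting a linear classifier — is easy to illustrate. The following is a minimal NumPy sketch, on synthetic data assumed to lie near a low-dimensional subspace, showing that the intrinsic dimension (3) can be far below the ambient dimension (100):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: points near a 3-dimensional subspace of R^100,
# so the intrinsic dimension is far below the ambient dimension.
n, ambient_dim, intrinsic_dim = 500, 100, 3
basis = rng.standard_normal((intrinsic_dim, ambient_dim))
X = rng.standard_normal((n, intrinsic_dim)) @ basis
X += 0.01 * rng.standard_normal((n, ambient_dim))  # small off-subspace noise

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
_, singular_values, _ = np.linalg.svd(Xc, full_matrices=False)
explained = singular_values**2 / np.sum(singular_values**2)

# The top 3 principal components capture almost all of the variance,
# so projecting onto them shrinks the dimension 100 -> 3 with little loss.
print(np.sum(explained[:intrinsic_dim]))
```

Projecting onto the top components before training then yields the two benefits the abstract names: cheaper proximity search in the reduced space, and generalization bounds that scale with the (low) intrinsic dimension rather than the ambient one.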
