Adaptive Metric Dimensionality Reduction

Abstract

We study data-adaptive dimensionality reduction in the context of supervised learning in general metric spaces. Our main statistical contribution is a generalization bound for Lipschitz functions in metric spaces that are doubling, or nearly doubling, which yields a new theoretical explanation for empirically reported improvements gained by preprocessing Euclidean data by PCA (Principal Components Analysis) prior to constructing a linear classifier. On the algorithmic front, we describe an analogue of PCA for metric spaces, namely an efficient procedure that approximates the data's intrinsic dimension, which is often much lower than the ambient dimension. Our approach thus leverages the dual benefits of low dimensionality: (1) more efficient algorithms, e.g., for proximity search, and (2) more optimistic generalization bounds.
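The PCA preprocessing step the abstract refers to can be sketched as follows. This is an illustrative toy example, not the paper's algorithm: the synthetic data, variable names, and the 99% variance threshold are all assumptions made for the sketch. Data with low intrinsic dimension is embedded in a high-dimensional ambient space, and PCA (via SVD of the centered data matrix) recovers a representation whose dimension is close to the intrinsic one, far below the ambient dimension.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 points lying near a 3-dimensional subspace of a
# 50-dimensional ambient space (dimensions chosen for illustration only).
n, ambient_dim, intrinsic_dim = 200, 50, 3
basis = rng.normal(size=(intrinsic_dim, ambient_dim))
X = rng.normal(size=(n, intrinsic_dim)) @ basis
X += 0.01 * rng.normal(size=X.shape)  # small off-subspace noise

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of total variance captured by the top-k principal components;
# pick the smallest k reaching 99% (an arbitrary cutoff for the sketch).
explained = np.cumsum(s**2) / np.sum(s**2)
k = int(np.searchsorted(explained, 0.99)) + 1

# Project onto the top-k components: the low-dimensional representation
# that a linear classifier would then be trained on.
X_reduced = Xc @ Vt[:k].T

print(k)                # recovered dimension, close to intrinsic_dim
print(X_reduced.shape)  # (n, k): much smaller than (n, ambient_dim)
```

Because the noise variance is tiny relative to the subspace variance, the recovered `k` matches the intrinsic dimension here; on real data the spectrum decays more gradually, which is where the paper's doubling-dimension analysis takes over for general metric spaces.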