IEEE International Conference on Data Mining

Fast Distance Metrics in Low-Dimensional Space for Neighbor Search Problems


Abstract

We consider popular dimension reduction techniques that project data onto a low-dimensional subspace, including Principal Component Analysis, Column Subset Selection, and Johnson-Lindenstrauss projections. These techniques have classically been used to compute various approximations efficiently. We propose the following three-step procedure for enhancing the accuracy of such approximations: 1. Unknown quantities in the approximation are replaced with random variables. 2. The Maximum Entropy method is applied to infer the most likely probability distribution. 3. Expected values of the random variables are used to compute the enhanced estimates. Our use of the Maximum Entropy method requires knowledge of vector norms, which can be easily computed during the dimension reduction. We demonstrate significant enhancements in average accuracy for Euclidean distance and Mahalanobis distance, and improvements in evaluating k-nearest neighbors and k-furthest neighbors by using the enhanced Euclidean distance formula.
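The abstract does not state the enhanced distance formula, so the following is only a minimal sketch of how the three steps might play out for squared Euclidean distance. It assumes a coordinate-subset projection as a simple stand-in for Column Subset Selection. Under any orthogonal projection, the true squared distance equals the projected squared distance, plus the squared norms of the two discarded parts, minus twice their inner product. Only that inner product is unknown after reduction. The sketch assumes its expected value is zero, which is an illustrative choice and not a result taken from the paper. All function names are hypothetical.

```python
def reduce_point(x, keep):
    """Keep the coordinates listed in `keep` and record the squared norm
    of the discarded coordinates (cheap to compute during the reduction)."""
    keep_set = set(keep)
    kept = [x[i] for i in keep]
    dropped_sq_norm = sum(v * v for i, v in enumerate(x) if i not in keep_set)
    return kept, dropped_sq_norm


def projected_sq_dist(a, b):
    """Classical estimate: squared distance using only the kept coordinates."""
    return sum((p - q) ** 2 for p, q in zip(a[0], b[0]))


def enhanced_sq_dist(a, b):
    """Sketch of an enhanced estimate. ASSUMPTION (not from the paper): the
    unknown inner product of the discarded parts is replaced by an expected
    value of zero, so only the stored squared norms are added."""
    return projected_sq_dist(a, b) + a[1] + b[1]


def true_sq_dist(x, y):
    """Exact squared Euclidean distance in the original space."""
    return sum((p - q) ** 2 for p, q in zip(x, y))


# Usage: reduce two 4-dimensional points to their first two coordinates.
x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 2.0, -4.0, 3.0]
a = reduce_point(x, [0, 1])
b = reduce_point(y, [0, 1])
classical = projected_sq_dist(a, b)
enhanced = enhanced_sq_dist(a, b)
exact = true_sq_dist(x, y)
```

In this example the discarded parts (3, 4) and (-4, 3) happen to be orthogonal, so the enhanced estimate matches the exact value while the projected distance alone underestimates it. For general data the enhanced estimate is only correct on average, and only to the extent that the zero-mean assumption holds.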
