International Conference on Data Engineering

High dimensional similarity joins: algorithms and performance evaluation

Abstract

Current data repositories hold a variety of data types, including audio, images, and time series. State-of-the-art techniques for indexing and querying such data rely on transforming data elements into points in a multidimensional feature space. Indexing and query processing then take place in the feature space. We study algorithms for finding relationships among points in multidimensional feature spaces, specifically algorithms for multidimensional joins. As with joins of conventional relations, correlations between multidimensional feature spaces can offer valuable information about the data sets involved. We present several algorithmic paradigms for solving the multidimensional join problem and discuss their features and limitations. We propose a generalization of the Size Separation Spatial Join algorithm, named Multidimensional Spatial Join (MSJ), to solve the multidimensional join problem. We evaluate MSJ along with several other specific algorithms, comparing their performance for various dimensionalities on both real and synthetic multidimensional data sets. Our experimental results indicate that MSJ, which is based on space-filling curves, consistently yields good performance across a wide range of dimensionalities.
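
To make the multidimensional join problem concrete, the following is a minimal Python sketch of a distance-based similarity join over two point sets. It is not the MSJ algorithm and is not taken from the paper. The join predicate (Euclidean distance at most eps), the function names brute_force_join and grid_join, and the uniform-grid filter are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from itertools import product


def brute_force_join(A, B, eps):
    """Return all index pairs (i, j) with dist(A[i], B[j]) <= eps.

    Reference definition of the join; compares every pair of points.
    """
    return {(i, j)
            for i, a in enumerate(A)
            for j, b in enumerate(B)
            if math.dist(a, b) <= eps}


def cell_of(point, eps):
    """Map a point to the integer coordinates of its grid cell (cell width eps)."""
    return tuple(math.floor(x / eps) for x in point)


def grid_join(A, B, eps):
    """Same result as brute_force_join, using a uniform grid as a filter.

    Two points within eps of each other differ by at most eps in every
    coordinate, so their cell indices differ by at most 1 per dimension.
    Only the 3**d neighbouring cells therefore need to be inspected.
    """
    d = len(A[0])
    cells = defaultdict(list)
    for j, b in enumerate(B):
        cells[cell_of(b, eps)].append(j)

    offsets = list(product((-1, 0, 1), repeat=d))
    result = set()
    for i, a in enumerate(A):
        base = cell_of(a, eps)
        for off in offsets:
            key = tuple(c + o for c, o in zip(base, off))
            for j in cells.get(key, ()):
                # Refinement step: the grid only filters candidates.
                if math.dist(a, B[j]) <= eps:
                    result.add((i, j))
    return result


A = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
B = [(0.1, 0.0), (1.0, 1.4), (9.0, 9.0)]
pairs = grid_join(A, B, eps=0.5)
```

The grid filter inspects 3**d neighbouring cells per point, so its cost grows exponentially with the number of dimensions d. This limitation is one reason to evaluate join algorithms across a range of dimensionalities.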
