Conference proceedings: HomeNet.TV Plus

Declustering large multidimensional data sets for range queries over heterogeneous disks

Abstract

Declustering is a technique for distributing data sets over multiple disks so that future retrievals are well balanced across the disks and can be performed in parallel. Although clusters often have heterogeneous disks, most declustering work has focused only on homogeneous environments. In this work, we investigate the declustering problem for a heterogeneous disk environment using virtual servers, and propose approaches for deciding the number of virtual servers and the mapping between virtual servers and physical disks. Our experimental results show that, by combining our algorithm for choosing the number of virtual servers with a greedy algorithm for mapping virtual servers to disks, users can expect average range query retrieval performance within 4% of the optimum achievable in practice, in all configurations studied. Compared to an intuitively natural approach to the problem, this represents an improvement of 8-31% in average fetch ratio, as well as a 26-38% reduction in the standard deviation of performance for small queries.
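
The abstract does not spell out the greedy mapping step, so the following is only a minimal sketch of one plausible greedy strategy, not the authors' algorithm. It assumes every virtual server carries an equal share of the data and that each disk is described by a single relative speed; the function name `greedy_map` and the parameter `disk_speeds` are hypothetical and chosen for illustration.

```python
from typing import List


def greedy_map(num_virtual: int, disk_speeds: List[float]) -> List[int]:
    """Assign equally loaded virtual servers to heterogeneous disks.

    Illustrative assumption (not taken from the paper): the cost of a disk
    is (virtual servers assigned to it) / (its relative speed), and each
    virtual server goes to the disk whose cost would be lowest after
    receiving it. Ties go to the lowest disk index.
    """
    if num_virtual < 0:
        raise ValueError("num_virtual must be non-negative")
    if not disk_speeds or any(s <= 0 for s in disk_speeds):
        raise ValueError("disk_speeds must be non-empty and positive")

    counts = [0] * len(disk_speeds)   # virtual servers per disk so far
    mapping: List[int] = []           # mapping[v] = disk index for server v

    for _ in range(num_virtual):
        # Pick the disk with the smallest projected cost after the assignment.
        best = min(
            range(len(disk_speeds)),
            key=lambda d: (counts[d] + 1) / disk_speeds[d],
        )
        counts[best] += 1
        mapping.append(best)

    return mapping


if __name__ == "__main__":
    # Two disks, the first twice as fast as the second, six virtual servers.
    print(greedy_map(6, [2.0, 1.0]))
```

Under these assumptions, the faster disk ends up with a share of virtual servers roughly proportional to its speed, which is the load-balancing intent that virtual servers serve in a heterogeneous setting. How many virtual servers to create is a separate decision that the paper addresses with its own algorithm, which this sketch does not attempt to reproduce.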