
Efficient declustering and indexing techniques for temporal databases and information retrieval.


Abstract

This work focuses on declustering data to improve query performance, because I/O becomes a bottleneck in databases and information retrieval systems that hold huge amounts of data. The bottleneck is not new, but it is becoming increasingly apparent. We therefore investigate techniques for such declustering, that is, for distributing data across different disks according to the probability that the items will be retrieved together in the same query. The assumed architecture is a single processor with multiple disks that store the data and can be accessed in parallel. We also investigate access structures that store data in a way that optimizes Boolean queries. The declustering techniques that we propose perform better than traditionally used techniques such as random or round-robin assignment. We propose several techniques: T-proximity and KT-proximity, which are suitable for temporal databases, and set intersection-based, multiset intersection-based, vector, and Euclidean techniques, as well as a proximity technique, for information retrieval systems. The access structures that we propose for optimizing Boolean queries give a response time that is orders of magnitude lower than the traditional approach of treating a Boolean query as multiple queries, one for each of its literals, and then merging the results of those queries to obtain the final result.
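
The idea of placing items that are likely to be retrieved together on different disks can be illustrated with a minimal sketch. This is not the thesis's own algorithm: the abstract does not define T-proximity, KT-proximity, or the other measures, so the sketch assumes a simple stand-in proximity (the size of the intersection of two documents' term sets) and a greedy placement rule. All function names, the sample documents, and the cost model (response time equals the largest number of retrieved items on any one disk) are hypothetical choices made for illustration.

```python
def round_robin(doc_ids, num_disks):
    """Baseline: the i-th document goes to disk i mod num_disks."""
    return {doc_id: i % num_disks for i, doc_id in enumerate(doc_ids)}


def similarity(terms_a, terms_b):
    """Assumed proximity measure: number of terms two documents share."""
    return len(terms_a & terms_b)


def greedy_decluster(docs, num_disks):
    """Place each document on the disk whose current contents are least
    similar to it, breaking ties in favour of the less loaded disk."""
    disks = [[] for _ in range(num_disks)]
    assignment = {}
    for doc_id, terms in docs.items():
        def cost(d):
            total = sum(similarity(terms, docs[other]) for other in disks[d])
            return (total, len(disks[d]))
        best = min(range(num_disks), key=cost)
        disks[best].append(doc_id)
        assignment[doc_id] = best
    return assignment


def response_time(assignment, retrieved, num_disks):
    """Assumed cost model for parallel disks: the busiest disk's item count."""
    load = [0] * num_disks
    for doc_id in retrieved:
        load[assignment[doc_id]] += 1
    return max(load)


# Hypothetical sample data: d0/d2 share "temporal", d1/d3 share "index".
DOCS = {
    "d0": {"temporal", "database"},
    "d1": {"index", "retrieval"},
    "d2": {"temporal", "query"},
    "d3": {"index", "boolean"},
}

if __name__ == "__main__":
    rr = round_robin(list(DOCS), 2)
    greedy = greedy_decluster(DOCS, 2)
    hits = ["d0", "d2"]  # documents matching a query on "temporal"
    print("round robin:", response_time(rr, hits, 2))
    print("greedy:     ", response_time(greedy, hits, 2))
```

On this toy input, round robin puts both "temporal" documents on the same disk, while the greedy rule separates them, so the query can read from both disks in parallel. Real workloads would need a proximity measure derived from actual query statistics.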

Bibliographic details

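  • Author

    Behl, Sanjiv

  • Author affiliation

    University of Houston

  • Degree-granting institution: University of Houston
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2002
  • Pages: 99 p.
  • Total pages: 99
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology
  • Keywords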
