Conference: International Conference on Algorithms and Architectures for Parallel Processing

Distributed parallel generation of indices for very large text databases


Abstract

We propose a new algorithm for the parallel generation of suffix arrays for large text databases on high-bandwidth computer networks. Suffix arrays are full-text index structures that support very powerful query languages. Our algorithm is based on a parallel indirect mergesort (not a simple mergesort procedure) and is compared with a well-known sequential algorithm that is very efficient on a single machine. Although network-bound, the parallel version is, both theoretically and experimentally, a much better alternative than the sequential version, which is bound by disk I/O.
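
To make the terms in the abstract concrete, here is a minimal single-process sketch in Python of what a suffix array is and what an "indirect" merge means: the values being sorted and merged are suffix start positions, compared by the text they point to. The abstract does not give the algorithm's details, so this is not the authors' method. The function names, the contiguous partitioning, and the `workers` parameter are assumptions made for illustration, and the naive slicing comparisons are far too slow for large texts.

```python
import heapq


def build_suffix_array(text):
    """Naive construction: sort suffix start positions by the suffix each points to."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def merge_indirect(text, runs):
    """k-way merge of sorted runs of positions, comparing the pointed-to suffixes."""
    return list(heapq.merge(*runs, key=lambda i: text[i:]))


def partitioned_suffix_array(text, workers):
    """Hypothetical partitioning: each 'worker' sorts the suffix positions of one
    contiguous slice, then the sorted runs are merged into one suffix array.
    Everything runs in one process here; in a distributed setting the
    comparisons would need text held on other machines."""
    n = len(text)
    step = max(1, -(-n // workers))  # ceiling division, at least 1
    runs = [
        sorted(range(start, min(start + step, n)), key=lambda i: text[i:])
        for start in range(0, n, step)
    ]
    return merge_indirect(text, runs)


def find(text, sa, pattern):
    """Return the sorted start positions of pattern, using two binary searches
    over the suffix array (lower and upper bound on the matching range)."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[first:lo])
```

For example, `partitioned_suffix_array("banana", 3)` yields the same array as `build_suffix_array("banana")`, and `find` can then locate every occurrence of a pattern with a number of comparisons logarithmic in the text length.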
