System and method for multithreaded text indexing for next generation multi-core architectures

Abstract

A system and method for indexing documents in a data storage system include generating a single-document hash table in storage memory for a single document, using index construction in a multithreaded, scalable configuration in which each of multiple threads is assigned its own work so as to reduce synchronization between threads. Generating the single-document hash table includes: partitioning the single document and indexing the strings in each partitioned portion to create a minor hash table for each document sub-part; generating a document-level hash table from the minor hash tables; updating a stream-level hash table that maps every string to a global identifier; and generating a term-reordered array from the document-level hash table.
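
The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify data layouts, so every name here (`stream_table`, `partition`, `build_minor_table`, `index_document`) is hypothetical, the tables store simple term counts, and "term-reordered array" is assumed to mean the document's terms sorted by global identifier. In standard CPython the global interpreter lock also limits parallel speedup for this kind of work, so the sketch shows only the structure: per-thread minor tables that need no locking, with synchronization confined to the stream-level update.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import threading

# Hypothetical names throughout; the abstract does not specify data layouts.
stream_table = {}              # stream-level table: term -> global identifier
stream_lock = threading.Lock()


def partition(text, parts):
    """Split a document into roughly equal runs of whitespace-separated tokens."""
    tokens = text.split()
    size = max(1, -(-len(tokens) // parts))  # ceiling division, at least 1
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def build_minor_table(tokens):
    """Build the table for one sub-part; each task owns its table, so no lock."""
    return Counter(tokens)


def index_document(text, parts=4):
    chunks = partition(text, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        minor_tables = list(pool.map(build_minor_table, chunks))

    # Merge the minor tables into the document-level table.
    doc_table = Counter()
    for table in minor_tables:
        doc_table.update(table)

    # Update the stream-level table: the only shared, synchronized step.
    with stream_lock:
        for term in doc_table:
            stream_table.setdefault(term, len(stream_table))
        # Assumption: "term reordered" means sorted by global identifier.
        reordered = sorted((stream_table[t], c) for t, c in doc_table.items())
    return reordered
```

Calling `index_document("a b a c b a", parts=2)` splits the text into two three-token sub-parts, counts each independently, merges the counts, and returns a list of `(global_id, count)` pairs. Identifiers persist in `stream_table` across calls, so terms seen in earlier documents keep their identifiers.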
