Conference: International Conference on Computer and Network Technology

A Fast N-gram Index Construction Algorithm based on Forecast Allocation and Linking Memory Management for Large Chinese Corpus

Abstract

To improve the speed of searching and of inverted index construction in information retrieval over a large Chinese corpus, an N-gram index is chosen for its fast searching speed after comparison with character and word indexes. A fast N-gram index construction algorithm based on forecast allocation and linking memory management is presented to speed up inverted index construction for large corpora. When an inverted list is full, a forecast function allocates memory for it according to historical data, and the newly allocated memory is linked to the end of the original space without moving the original data. Experimental results demonstrate that the new method improves memory utilization during index construction and reduces index construction time.
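
The abstract does not give the forecast function or the memory layout, so the following Python sketch is only an illustration of the two ideas it names: sizing a new block from history when an inverted list fills up, and linking that block to the end of the list instead of copying existing postings. The names `ngrams`, `forecast_block_size`, `Block`, `LinkedPostingList`, and `build_index` are hypothetical, and the forecast rule (postings per document seen so far, multiplied by the documents remaining) is an assumption, not the paper's function.

```python
import math


def ngrams(text, n=2):
    """Yield (position, gram) for every overlapping character n-gram."""
    for i in range(len(text) - n + 1):
        yield i, text[i:i + n]


def forecast_block_size(stored, docs_seen, docs_total, minimum=4):
    """Assumed forecast rule: extrapolate the postings still to come
    from the rate observed so far (postings per processed document)."""
    remaining_docs = max(docs_total - docs_seen, 0)
    rate = stored / max(docs_seen, 1)
    return max(minimum, math.ceil(rate * remaining_docs))


class Block:
    """One contiguous chunk of posting slots in the linked chain."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.used = 0
        self.next = None


class LinkedPostingList:
    """Inverted list stored as a singly linked chain of blocks."""

    def __init__(self, first_block_size=4):
        self.head = self.tail = Block(first_block_size)
        self.count = 0

    def append(self, posting, docs_seen, docs_total):
        if self.tail.used == len(self.tail.slots):
            # List is full: forecast the next block size from history
            # and link it to the end. Existing postings are not moved.
            size = forecast_block_size(self.count, docs_seen, docs_total)
            new_block = Block(size)
            self.tail.next = new_block
            self.tail = new_block
        self.tail.slots[self.tail.used] = posting
        self.tail.used += 1
        self.count += 1

    def block_sizes(self):
        """Return the capacity of each block, in chain order."""
        sizes, block = [], self.head
        while block is not None:
            sizes.append(len(block.slots))
            block = block.next
        return sizes

    def __iter__(self):
        block = self.head
        while block is not None:
            yield from block.slots[:block.used]
            block = block.next


def build_index(docs, n=2):
    """Build a character n-gram inverted index of (doc_id, position)."""
    index = {}
    for doc_id, text in enumerate(docs):
        for pos, gram in ngrams(text, n):
            if gram not in index:
                index[gram] = LinkedPostingList()
            index[gram].append((doc_id, pos), doc_id + 1, len(docs))
    return index


if __name__ == "__main__":
    docs = ["中文信息检索", "中文语料库", "信息检索系统"]
    index = build_index(docs)
    print(list(index["检索"]))
```

Compared with a single growable array, which copies every stored posting each time it is resized, the linked chain only ever writes new postings. The cost is that a poor forecast either wastes slots in the tail block or produces many small blocks, which is why the choice of forecast function matters for memory utilization.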