Computer-implemented method and information retrieval system for indexing and searching documents in a database

Abstract

An information retrieval system stores and retrieves documents using particles and a particle-based language model. A set of particles for a collection of documents in a particular language is constructed from training documents such that a perplexity of the particle-based language model is substantially lower than the perplexity of a word-based language model constructed from the same training documents. The documents can then be converted to document particle graphs from which particle-based keys are extracted to form an index to the documents. Users can then retrieve relevant documents using queries also in the form of particle graphs.
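The indexing and retrieval flow described above can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how particles are derived, so `to_particles` is a hypothetical stand-in that simply cuts each word into fixed-size character chunks, the particle graph is reduced to a single linear path, and the keys are assumed to be particle bigrams. The perplexity-driven selection of the particle set is not modeled here. All function names are illustrative.

```python
from collections import defaultdict


def to_particles(word, size=3):
    """Hypothetical particle splitter: fixed-size character chunks.

    Assumption: the source derives particles from training documents;
    this stand-in only mimics the idea of sub-word units.
    """
    return [word[i:i + size] for i in range(0, len(word), size)]


def particle_keys(text, n=2):
    """Flatten text into one particle sequence and return its n-gram keys.

    Assumption: a single linear path replaces the particle graph.
    """
    particles = [p for w in text.lower().split() for p in to_particles(w)]
    if len(particles) < n:
        return {tuple(particles)} if particles else set()
    return {tuple(particles[i:i + n]) for i in range(len(particles) - n + 1)}


def build_index(docs):
    """Map each particle-based key to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for key in particle_keys(text):
            index[key].add(doc_id)
    return index


def search(index, query):
    """Rank documents by the number of particle keys shared with the query."""
    scores = defaultdict(int)
    for key in particle_keys(query):
        for doc_id in index.get(key, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


docs = {
    "d1": "particle based language model",
    "d2": "word based retrieval system",
}
index = build_index(docs)
print(search(index, "language model"))
print(search(index, "based"))
```

Because the query goes through the same particle decomposition as the documents, matching happens at the level of particle keys rather than whole words, which is the property the abstract relies on.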