
A method for storing bibliometric information on items from a finite source of text, and in particular document postings for use in a full-text document retrieval system


Abstract

A method is presented for compressing, storing, and retrieving bibliometric information on multiple sources of text. The compression consists of two parts and may use any of the many ordering-based bibliometric laws for sources of text. The first part stores bibliometric information on the items from a text source, using the rank of each item in the ordering relation defined by the bibliometric law as an indication of that information. The second part uses pointers and tables to eliminate redundant information. As an application, a posting compression method is presented for use in term-weighting retrieval systems. Here the first part uses a postulated rank-occurrence frequency relation for the document in question whose only variable is the document's length, for example Zipf's law, which states that the product of rank and frequency is approximately constant. The second part uses pointers and a few tables alongside the principal storage. The compression makes use of direct random addressability. All postings relating to a particular document may be stored together, allowing easy expandability and updating. Compared with conventional technology, storage requirements are roughly halved.
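The rank-based first compression part can be illustrated with a minimal Python sketch. It is not taken from the patent text: the function names (`zipf_constant`, `compress_postings`, `estimated_frequency`) are hypothetical, and the formula used to derive the Zipf constant from document length alone is an illustrative assumption. The second part (pointers and tables) is not modelled here.

```python
import math
from collections import Counter

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def zipf_constant(doc_length: int) -> float:
    """Assumed estimate of C in rank * frequency ~ C, from length alone.

    Illustrative choice only: C = N / (ln N + gamma). The abstract merely
    requires that the relation depend on nothing but the document's length.
    """
    if doc_length < 2:
        return float(doc_length)
    return doc_length / (math.log(doc_length) + EULER_GAMMA)


def compress_postings(tokens):
    """Return (doc_length, {term: rank}); exact frequencies are discarded."""
    counts = Counter(tokens)
    # Rank 1 is the most frequent term; ties are broken alphabetically
    # so that the ordering is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    ranks = {term: rank for rank, (term, _) in enumerate(ordered, start=1)}
    return len(tokens), ranks


def estimated_frequency(doc_length: int, rank: int) -> float:
    """Recover an approximate frequency from the rank via Zipf's law."""
    return zipf_constant(doc_length) / rank


if __name__ == "__main__":
    length, ranks = compress_postings("a a a a b b c".split())
    for term, rank in ranks.items():
        print(term, rank, round(estimated_frequency(length, rank), 3))
```

Only the document length and one small rank per term are kept, so a term weight can be approximated at retrieval time without storing each occurrence count. The recovered frequencies are estimates, which is the trade-off such a scheme accepts in exchange for smaller postings.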

Bibliographic data

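  • Publication number: EP0508519A2

  • Patent type:

  • Publication date: 1992-10-14

  • Original format: PDF

  • Applicant/patentee: PHILIPS ELECTRONICS N.V.

  • Application number: EP19920200891

  • Inventor: AALBERSBERG IJSBRAND JAN

  • Filing date: 1992-03-30

  • Classification: G06F15/401

  • Country: EP

  • Date added to database: 2022-08-22 05:28:54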
