Methods for encoder interface and data search by combinatorial signatures
Abstract
A database management system encodes information (such as the field values of a database record or the words of a text document) so that the original information can be searched efficiently by a computer. An information object is encoded into a small "signature" or codeword. A base or "leaf" signature is computed by a known technique such as hashing. The logical intersection (AND) of every possible pair of bits of the base signature is computed, and each result is stored as one bit of a longer combinatorial signature. The bit-wise logical union (bit-OR) of the combinatorial signatures of a group of records produces a second-level combinatorial signature representing particular field values present among those records. Higher-level combinatorial signatures are computed similarly. These combinatorial signatures avoid a "saturation" problem, which occurs when signatures are grouped together, and a "combinatorial error" problem, which falsely indicates the existence of non-existent records, thereby significantly improving the ability to reject data not relevant to a given query. When the combinatorial signatures are stored in a hierarchical data structure, such as a B-tree index of a database management system, they provide a means of searching database records or document text more efficiently by eliminating large amounts of non-matching data from further consideration.
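The encoding described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the signature width `M`, the SHA-256-based hash, the ordering of bit pairs, and the function names `base_signature`, `combinatorial_signature`, `group_signature`, and `may_match` are all assumptions chosen for illustration.

```python
import hashlib
from itertools import combinations

M = 16  # assumed width of the base ("leaf") signature, in bits


def base_signature(values, m=M):
    """Hash each field value or word to one bit of an m-bit base signature."""
    sig = 0
    for v in values:
        digest = hashlib.sha256(str(v).encode("utf-8")).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % m)
    return sig


def combinatorial_signature(base, m=M):
    """AND every pair of base bits; store each result as one output bit.

    The result has m*(m-1)/2 bits. Pair order follows
    itertools.combinations, which is an arbitrary choice of this sketch.
    """
    comb = 0
    for k, (i, j) in enumerate(combinations(range(m), 2)):
        if (base >> i) & 1 and (base >> j) & 1:
            comb |= 1 << k
    return comb


def group_signature(signatures):
    """Bit-wise OR of member signatures gives the next-level signature."""
    out = 0
    for s in signatures:
        out |= s
    return out


def may_match(query_sig, stored_sig):
    """A stored signature can match only if it covers every query bit."""
    return (query_sig & stored_sig) == query_sig


# Two records with disjoint base bits, and a query that mixes one bit
# from each record (no single record contains both).
rec_a, rec_b, query = 0b0011, 0b1100, 0b0110

# OR-ing plain base signatures wrongly accepts the group.
plain_hit = may_match(query, group_signature([rec_a, rec_b]))

# OR-ing combinatorial signatures rejects it, because no record set
# the pair bit for (bit 1, bit 2).
group = group_signature(
    [combinatorial_signature(rec_a, 4), combinatorial_signature(rec_b, 4)]
)
comb_hit = may_match(combinatorial_signature(query, 4), group)
```

In this example `plain_hit` is true while `comb_hit` is false, which is the "combinatorial error" the abstract says the pairwise encoding avoids. A query whose base signature has fewer than two bits set produces no pair bits in this sketch, so it would not be filtered at all; how the original method treats that case is not stated in the abstract.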