Journal: IEEE Transactions on Knowledge and Data Engineering

Efficient signature file methods for text retrieval

Abstract

Signature files have been studied extensively as an access method for textual databases. Many approaches have been proposed for searching signature files efficiently. However, different methods make different assumptions and use different performance measures, making it difficult to compare their performance. In this paper, we study three basic methods proposed in the literature, namely, the indexed descriptor file, the two-level superimposed coding scheme, and the partitioned signature file approach. The contribution of this paper is two-fold. First, we present a uniform analytical performance model so that the methods can be compared fairly and consistently. The analysis shows that the two-level superimposed coding scheme, if stored in a transposed file, has the best performance. Second, we extend the two-level superimposed coding method into a multilevel superimposed coding method. We obtain the optimal number of levels for the multilevel method and show that, for databases of reasonable size, the optimal value is much larger than 2, the value assumed in the two-level method. The accuracy of the analytical formula is demonstrated by simulation.
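
For readers unfamiliar with the underlying technique, the following is a minimal sketch of plain single-level superimposed coding, the building block that the two-level and multilevel schemes extend. It is not the paper's method or its performance model. The signature width `SIG_BITS`, the number of bits per word `BITS_PER_WORD`, the use of SHA-256 as the hash, and all function names are assumptions chosen for illustration.

```python
import hashlib

SIG_BITS = 64       # signature width in bits (assumed value)
BITS_PER_WORD = 3   # bits set per word (assumed value)


def word_signature(word: str) -> int:
    """Hash a word to a bit pattern with at most BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode("utf-8")).digest()
        pos = int.from_bytes(digest[:8], "big") % SIG_BITS
        sig |= 1 << pos
    return sig


def block_signature(words) -> int:
    """Superimpose (bitwise OR) the word signatures of one text block."""
    sig = 0
    for w in words:
        sig |= word_signature(w)
    return sig


def may_contain(block_sig: int, query_sig: int) -> bool:
    """True if every bit of the query signature is set in the block signature."""
    return (block_sig & query_sig) == query_sig


def search(blocks, query_word: str):
    """Filter blocks by signature, then verify candidates against the text."""
    q = word_signature(query_word)
    candidates = [i for i, b in enumerate(blocks) if may_contain(block_signature(b), q)]
    # A signature match can be a false drop, so each candidate is checked
    # against the actual words of the block.
    target = query_word.lower()
    return [i for i in candidates if target in (w.lower() for w in blocks[i])]
```

A signature test never misses a block that contains the query word, but it can pass blocks that do not (false drops), which is why `search` verifies each candidate against the text. A real signature file would store the block signatures on disk rather than recompute them per query.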