
AN EVALUATION OF THE APPLICABILITY OF RANKING ALGORITHMS TO IMPROVING THE EFFECTIVENESS OF FULL TEXT RETRIEVAL


Abstract

It is generally accepted that information retrieval based on the full texts of documents results in higher recall and lower precision than retrieval using paragraphs, abstracts, or controlled vocabularies. The present study tested this assumption by comparing the effectiveness of full text retrieval with that of the other approaches in terms of recall and precision. It then focused on how to improve the precision of full text retrieval with a minimum decrease in recall.

A subset of Harvard Business Review records for the period January 1979 to August 1983 was selected for analysis, and nine search questions were selected to test the effectiveness of full text retrieval. Twenty-nine weighting algorithms developed for automatic extractive indexing were examined as means of improving the precision of full text retrieval.

There was a significant difference in retrieval effectiveness between full text and the other methods, at the .0001 level for recall and at the .0032 level for precision. That is, full text retrieval achieved significantly higher recall ratios and significantly lower precision ratios than paragraph, abstract, or controlled vocabulary searching. There was also a significant difference, at the .0290 level, between full text retrieval with weighting algorithms and full text retrieval without them, for precision measured at a fixed level of recall. All 29 algorithms improved the precision of full text retrieval over that obtained without algorithms. Twenty-two of the 29 algorithms achieved higher precision than paragraph searching, although these results were not statistically significant.

However, there was no significant difference among the algorithms. Their relative performance seemed to depend on the search strategy employed and the level of recall achieved. In general, an algorithm that gave high weights to common words and low weights to rare words in the Boolean intersection performed well at low levels of recall. The performance of the algorithms did not vary much at high levels of recall.
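
The two measures the abstract relies on, and the idea of measuring precision at a fixed level of recall, can be illustrated with a minimal Python sketch. This is not the dissertation's code or any of its 29 weighting algorithms: the function names `precision_recall` and `precision_at_recall`, the document identifiers, and the relevance judgments are all hypothetical, and the ranking is simply assumed to come from some weighting scheme.

```python
from typing import Hashable, Sequence, Set, Tuple


def precision_recall(retrieved: Set[Hashable],
                     relevant: Set[Hashable]) -> Tuple[float, float]:
    """Return (precision, recall) for an unranked retrieved set."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def precision_at_recall(ranked: Sequence[Hashable],
                        relevant: Set[Hashable],
                        target_recall: float) -> float:
    """Precision at the shallowest cutoff whose recall reaches target_recall.

    Returns 0.0 if the ranking never reaches the target (an assumption
    made for this sketch, not a convention taken from the source).
    """
    if not relevant:
        return 0.0
    hits = 0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
        if hits / len(relevant) >= target_recall:
            return hits / rank
    return 0.0


# Hypothetical relevance judgments for one search question.
relevant = {"d1", "d3", "d6"}

# Unranked Boolean full text search: everything matching is returned.
full_text = {"d1", "d2", "d3", "d4", "d5", "d6"}
print(precision_recall(full_text, relevant))  # (0.5, 1.0)

# The same documents ordered by some (assumed) weighting algorithm.
ranked = ["d1", "d3", "d2", "d6", "d4", "d5"]
print(precision_at_recall(ranked, relevant, 1.0))  # 0.75
```

In this toy case the unranked full text search has precision 0.5 at full recall, while cutting the ranked list off as soon as full recall is reached gives precision 0.75, which is the kind of gain at a fixed recall level that the abstract reports for the weighted runs.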

Bibliographic record

  • Author: RO, JUNG SOON
  • Author affiliation: Indiana University
  • Degree-granting institution: Indiana University
  • Subject: Information science
  • Degree: Ph.D.
  • Year: 1985
  • Pages: 224
  • Original format: PDF
  • Language: English
