International Conference on Applications of Natural Language to Information Systems

Ranked-Listed or Categorized Results in IR: 2 Is Better Than 1

Abstract

In this paper we examine the performance of both ranked-listed and categorized results in the context of known-item search (target testing). Performance of known-item search is easy to quantify based on the number of examined documents and class descriptions. Results are reported on a subset of the Open Directory classification hierarchy, which enables us to control the error rate and investigate how performance degrades with error. Three types of simulated user model are identified, together with two operating scenarios: correct and incorrect classification. Extensive empirical testing reveals that in the ideal scenario, i.e. perfect classification by both human and machine, a category-based system significantly outperforms a ranked list for all but the best queries, i.e. queries for which the target document was initially retrieved in the top 5. When either human or machine error occurs and the user follows a search strategy that is exclusively category-based, performance is much worse than for a ranked list. Most interestingly, however, if the user follows a hybrid strategy of first looking in the expected category and then reverting to a ranked list if the target is absent, performance can remain significantly better than for a ranked list, even with misclassification rates as high as 30%. We also observe that with this hybrid strategy performance degrades gracefully with error rate.
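
The search strategies compared in the abstract can be illustrated with a small cost model. The following is a minimal sketch under stated assumptions, not the authors' simulation: it assumes cost is the number of documents examined plus the number of class descriptions read, that reaching the expected category costs one class description, that documents inside a category keep their ranked-list order, and that a category-only user who misses the target works through the remaining categories in displayed order. All function names and the toy data are hypothetical.

```python
from typing import Dict, List


def ranked_list_cost(ranking: List[str], target: str) -> int:
    """Documents examined when scanning the ranked list from the top."""
    return ranking.index(target) + 1


def group_by_category(ranking: List[str], assigned: Dict[str, str]) -> Dict[str, List[str]]:
    """Group documents by assigned category, keeping rank order in each group."""
    groups: Dict[str, List[str]] = {}
    for doc in ranking:
        groups.setdefault(assigned[doc], []).append(doc)
    return groups


def category_only_cost(ranking: List[str], assigned: Dict[str, str],
                       target: str, expected: str) -> int:
    """Assumed exclusively category-based user: expected category first,
    then every other category in displayed order until the target is found."""
    groups = group_by_category(ranking, assigned)
    order = [expected] + [c for c in groups if c != expected]
    cost = 0
    for category in order:
        cost += 1  # one class description read
        docs = groups.get(category, [])
        if target in docs:
            return cost + docs.index(target) + 1
        cost += len(docs)  # every document in this category was examined
    raise ValueError("target is not in the ranking")


def hybrid_cost(ranking: List[str], assigned: Dict[str, str],
                target: str, expected: str) -> int:
    """Assumed hybrid user: expected category first, then the ranked list,
    skipping documents that were already examined."""
    docs = group_by_category(ranking, assigned).get(expected, [])
    cost = 1  # class description of the expected category
    if target in docs:
        return cost + docs.index(target) + 1
    cost += len(docs)
    remaining = [d for d in ranking if d not in set(docs)]
    return cost + remaining.index(target) + 1


# Toy data: eight retrieved documents, target "d6", user expects category "C".
ranking = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]
correct = {"d1": "A", "d2": "A", "d3": "B", "d4": "A",
           "d5": "B", "d6": "C", "d7": "C", "d8": "B"}
misclassified = dict(correct, d6="B")  # the target lands in the wrong category

for name, assigned in [("correct", correct), ("misclassified", misclassified)]:
    print(name,
          ranked_list_cost(ranking, "d6"),
          category_only_cost(ranking, assigned, "d6", "C"),
          hybrid_cost(ranking, assigned, "d6", "C"))
```

On this toy query the ranked list costs 6 in both scenarios. With correct classification both category-based strategies cost 2; with the target misclassified, the category-only strategy costs 10 and the hybrid costs 8. A single query says nothing about average behaviour, so the sketch only shows how the costs are counted, not the paper's reported results.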
