Venue: 9th International Conference on Language Resources and Evaluation

Evaluating Web-as-corpus Topical Document Retrieval with an Index of the OpenDirectory

Abstract

This article introduces a novel protocol and resource for evaluating Web-as-corpus topical document retrieval. In contrast to previous work, our goal is to provide an automatic, reproducible and robust evaluation for this task. We rely on the OpenDirectory (DMOZ) as a source of topically annotated webpages and index them in a search engine. With this OpenDirectory search engine, we can then easily evaluate the impact of various parameters, such as the number of seed terms, queries or documents, or the usefulness of various term selection algorithms. A first fully automatic evaluation is described and provides baseline performance for this task. The article concludes with practical information regarding the availability of the index and resource files.
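The evaluation idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy corpus, the in-memory inverted index, the Boolean AND retrieval, and the names `DOCS`, `build_index`, `search` and `evaluate` are all assumptions made for illustration. It only shows how topic labels on indexed pages allow retrieval precision to be computed automatically while the number of seed terms per query is varied.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy corpus: (doc_id, topic label, text). In the setting described
# above the labels would come from OpenDirectory categories; these are invented.
DOCS = [
    ("d1", "astronomy", "telescope observes distant galaxy and star clusters"),
    ("d2", "astronomy", "the orbit of a planet around its star"),
    ("d3", "cooking", "recipe for star anise soup with fresh herbs"),
    ("d4", "cooking", "oven baked bread recipe with flour and yeast"),
    ("d5", "astronomy", "galaxy survey maps planet forming regions"),
]


def build_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, _topic, text in docs:
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query_terms):
    """Boolean AND retrieval: documents containing every query term."""
    postings = [index.get(term, set()) for term in query_terms]
    return set.intersection(*postings) if postings else set()


def evaluate(docs, seed_terms, target_topic, terms_per_query=2):
    """Build queries from seed-term combinations and score the pooled results.

    Returns the set of retrieved document ids and the precision, i.e. the
    fraction of retrieved documents whose label equals target_topic.
    """
    index = build_index(docs)
    topic_of = {doc_id: topic for doc_id, topic, _text in docs}
    retrieved = set()
    for query in combinations(sorted(seed_terms), terms_per_query):
        retrieved |= search(index, query)
    relevant = {doc_id for doc_id in retrieved if topic_of[doc_id] == target_topic}
    precision = len(relevant) / len(retrieved) if retrieved else 0.0
    return retrieved, precision


if __name__ == "__main__":
    seeds = {"star", "galaxy", "planet"}
    for n in (1, 2):
        docs_found, precision = evaluate(DOCS, seeds, "astronomy", terms_per_query=n)
        print(n, sorted(docs_found), precision)
```

In this toy setting, single-term queries also pull in the cooking page that mentions "star", whereas two-term queries return only astronomy pages. This is the kind of parameter effect that a labelled index makes measurable without manual judgment.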