Source: Advances in Information Retrieval (conference)

Analyzing Information Retrieval Methods to Recover Broken Web Links

Abstract

In this work we compare different techniques for automatically finding candidate web pages to replace broken links. We extract information from the anchor text, the content of the page containing the link, and the cached page in a digital library. The selected information is processed and submitted to a search engine. We compare different information retrieval methods both for selecting the terms used to construct the queries submitted to the search engine and for ranking the candidate pages it returns, in order to help the user find the best replacement. In particular, we use term frequencies and a language model approach for term selection, and co-occurrence measures and a language model approach for ranking the final results. To test the different methods, we also define a methodology that does not require user judgments, which increases the objectivity of the results.
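
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tiny stopword list, the choice of Jelinek-Mercer smoothing, and the parameter values (`k`, `lam`) are all assumptions made for illustration. The sketch builds a query from the anchor text plus the most frequent terms of the page containing the link, then ranks candidate pages by query likelihood under a smoothed unigram language model. The search engine call is omitted; candidates are assumed to be already fetched.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "is", "on", "with"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def build_query(anchor_text, page_text, k=3):
    """Anchor terms plus the k most frequent page terms not already in the anchor."""
    anchor_terms = list(dict.fromkeys(tokenize(anchor_text)))  # dedupe, keep order
    counts = Counter(t for t in tokenize(page_text) if t not in anchor_terms)
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return anchor_terms + [term for term, _ in ordered[:k]]


def query_log_likelihood(query_terms, doc_text, collection_counts, lam=0.5):
    """log P(query | doc) under a unigram model with Jelinek-Mercer smoothing."""
    doc_counts = Counter(tokenize(doc_text))
    doc_len = sum(doc_counts.values()) or 1
    coll_len = sum(collection_counts.values())
    vocab = len(collection_counts) + 1
    score = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / doc_len
        # Add-one smoothing keeps the collection probability strictly positive.
        p_coll = (collection_counts[term] + 1) / (coll_len + vocab)
        score += math.log(lam * p_doc + (1 - lam) * p_coll)
    return score


def rank_candidates(query_terms, candidates):
    """Sort (url, text) pairs by descending query likelihood."""
    collection_counts = Counter()
    for _, text in candidates:
        collection_counts.update(tokenize(text))
    return sorted(
        candidates,
        key=lambda c: query_log_likelihood(query_terms, c[1], collection_counts),
        reverse=True,
    )
```

In use, `build_query` would receive the anchor text of the broken link and the text of the page that contains it, and `rank_candidates` would receive the pages returned by the search engine for that query. The co-occurrence ranking mentioned in the abstract is not shown here.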
