International Conference on Theory and Practice of Digital Libraries

Find, New, Copy, Web, Page - Tagging for the (Re-)Discovery of Web Pages


Abstract

The World Wide Web is highly dynamic, with resources constantly disappearing and (re-)surfacing. A ubiquitous result is the "404 Page not Found" error returned in response to requests for missing web pages. We investigate tags obtained from Delicious for the purpose of rediscovering such missing web pages with the help of search engines. We determine the best-performing tag-based query length, quantify the relevance of the results, and compare tags to retrieval methods based on a page's content. We find that tags are only useful in addition to content-based methods. We further introduce the notion of "ghost tags": terms used as tags that do not occur in the current version of a web page but did occur in a previous version. One third of these ghost tags are ranked high in Delicious and also occurred frequently in the document, which indicates their importance to both the user and the content of the document.
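
The definition of a ghost tag given in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `tokenize` and `ghost_tags`, the simple lowercase word tokenization, and the sample texts are all assumptions made for illustration only.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ghost_tags(tags, current_text, previous_text):
    """Return tags that are absent from the current page text
    but present in a previous version of the page."""
    current_terms = tokenize(current_text)
    previous_terms = tokenize(previous_text)
    return [
        tag for tag in tags
        if tag.lower() not in current_terms and tag.lower() in previous_terms
    ]


# Hypothetical example data, not taken from the paper.
tags = ["jazz", "festival", "music"]
previous_version = "Annual jazz festival lineup"
current_version = "Festival tickets on sale"
print(ghost_tags(tags, current_version, previous_version))  # ['jazz']
```

Here "festival" still occurs in the current version and "music" never occurred in either version, so only "jazz" qualifies as a ghost tag under this simplified reading of the definition.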
