Profiling Web Archive Coverage for Top-Level Domain and Content Language

Abstract

The Memento aggregator currently polls every known public web archive when serving a request for an archived web page, even though some web archives focus on only specific domains and ignore the others. Similar to query routing in distributed search, we investigate the impact on aggregated Memento TimeMaps (lists of when and where a web page was archived) by only sending queries to archives likely to hold the archived page. We profile twelve public web archives using data from a variety of sources (the web, archives' access logs, and full-text queries to archives) and discover that only sending queries to the top three web archives (i.e., a 75% reduction in the number of queries) for any request produces the full TimeMaps in 84% of the cases.
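The routing idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the archive names, the per-TLD coverage scores, and the functions `tld_of`, `route`, and `query_reduction` are all hypothetical, and the profile numbers are invented for the example. It assumes each archive's profile is a mapping from top-level domain to a coverage score, and that the aggregator queries only the `k` highest-scoring archives for the requested URI's TLD.

```python
from urllib.parse import urlsplit

# Hypothetical profiles: archive name -> {TLD -> coverage score in [0, 1]}.
# These numbers are made up for illustration; they are not from the paper.
ARCHIVE_PROFILES = {
    "archive-a": {"com": 0.80, "uk": 0.30, "jp": 0.10},
    "archive-b": {"com": 0.10, "uk": 0.90},
    "archive-c": {"com": 0.40, "jp": 0.70},
    "archive-d": {"com": 0.05, "uk": 0.20, "jp": 0.05},
}


def tld_of(uri):
    """Return the last label of the URI's host name, lowercased."""
    host = urlsplit(uri).hostname or ""
    return host.rsplit(".", 1)[-1].lower()


def route(uri, profiles, k=3):
    """Pick the k archives whose profile scores highest for the URI's TLD."""
    tld = tld_of(uri)
    ranked = sorted(
        profiles,
        key=lambda name: profiles[name].get(tld, 0.0),
        reverse=True,
    )
    return ranked[:k]


def query_reduction(total_archives, k):
    """Fraction of queries saved by polling k archives instead of all of them."""
    return 1 - k / total_archives


selected = route("http://example.co.uk/page", ARCHIVE_PROFILES, k=2)
saved = query_reduction(12, 3)  # twelve archives, top three queried
```

With twelve archives and `k = 3`, `query_reduction` gives 0.75, which matches the 75% reduction stated in the abstract. A content-language profile could be added the same way, by keying a second mapping on language instead of TLD.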

Bibliographic details

Source
International Conference on Theory and Practice of Digital Libraries | 2013 | 12 pages
Authors
Ahmed Alsum; Michele C. Weigle; Michael L. Nelson; Herbert Van de Sompel;
Original format: PDF
Chinese Library Classification: G250.76-53
Keywords
Web archive; Query routing; Memento aggregator;

Similar literature

1. Profiling web archive coverage for top-level domain and content language [J]. Ahmed AlSum, Michele C. Weigle, Michael L. Nelson. International Journal on Digital Libraries, 2014, No. 3-4.
2. The Effect of Top-Level Domains and Advertisements on Health Web Site Credibility [J]. Joseph B Walther, Zuoming Wang, Tracy Loh. Journal of Medical Internet Research, 2004, No. 3.
3. The Web Archives Workbench (WAW) Tool Suite: Taking an Archival Approach to the Preservation of Web Content [J]. Patricia Hswe, Joanne Kaczmarek, Leah Houser. Library Trends, 2009, No. 3.
4. Profiling Web Archive Coverage for Top-Level Domain and Content Language [C]. Ahmed Alsum, Michele C. Weigle, Michael L. Nelson. International Conference on Theory and Practice of Digital Libraries, 2013.
5. Access to the archives? Art museum websites and online archives in the public domain [D]. Pastore, Erica M. 2008.
6. Into the Dark Domain: The UK Web Archive as a Source for the Contemporary History of Public Health [O]. Martin Gorsky.
7. Profiling Web Archive Coverage for Top-Level Domain and Content Language [O]. Ahmed Alsum, Michele C. Weigle, Michael L. Nelson. 2014.
