ACM/IEEE-CS Joint Conference on Digital Libraries

WARCreate - Create Wayback-Consumable WARC Files from Any Webpage


Abstract

The Internet Archive's Wayback Machine is the most common way that typical users interact with web archives. The Internet Archive uses the Heritrix web crawler to transform pages on the publicly available web into Web ARChive (WARC) files, which can then be accessed using the Wayback Machine. Because Heritrix can only access the publicly available web, many personal pages (e.g., password-protected pages, social media pages) cannot be easily archived into the standard WARC format. We have created a Google Chrome extension, WARCreate, that allows a user to create a WARC file from any webpage. Using this tool, content that might otherwise have been lost in time can be archived in a standard format by any user. This tool provides a way for casual users to easily create archives of personal online content. This is one of the first steps in resolving issues of "long term storage, maintenance, and access of personal digital assets that have emotional, intellectual, and historical value to individuals" [3].
