Journal: International Journal on Digital Libraries

The impact of JavaScript on archivability


Abstract

As web technologies evolve, web archivists work to adapt so that digital history is preserved. Recent advances in web technologies have introduced client-side executed scripts (Ajax) that, for example, load data without a change in the top-level Uniform Resource Identifier (URI) or require user interaction (e.g., content loaded via Ajax once the page has been scrolled). These advances have made it harder to automate the capture of web pages. In an effort to understand why mementos (archived versions of live resources) in today's archives vary in completeness and sometimes pull content from the live web, we present a study of web resources and archival tools. Our investigation used a collection of URIs shared over Twitter and a collection of URIs curated by Archive-It. We created local archived versions of the URIs from the Twitter and Archive-It sets using WebCite, wget, and the Heritrix crawler. We found that only 4.2% of the Twitter collection is perfectly archived by all of these tools, while 34.2% of the Archive-It collection is perfectly archived. After studying the quality of these mementos, we identified the practice of loading resources via JavaScript (Ajax) as the source of archival difficulty. Further, we show that resources are increasing their use of JavaScript to load embedded resources. By 2012, over half (54.5%) of pages used JavaScript to load embedded resources. The number of embedded resources loaded via JavaScript increased by 12.0% from 2005 to 2012. We also show that JavaScript is responsible for 33.2% more missing resources in 2012 than in 2005, which means that JavaScript accounts for an increasing proportion of the embedded resources that mementos fail to load. Overall, JavaScript is responsible for 52.7% of all missing embedded resources in our study.
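The difficulty described in the abstract can be illustrated with a minimal Python sketch. It assumes a crawler that parses markup but does not execute scripts; it is not how WebCite, wget, or Heritrix is actually implemented. The page content, the `EmbeddedResourceParser` class, and the `statically_visible_uris` helper are all hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page: one image is declared in the markup, while a second
# resource is requested only when a browser executes the inline script.
PAGE = """
<html><body>
<img src="/static/logo.png">
<script>
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/api/photo.json");
  xhr.send();
</script>
</body></html>
"""


class EmbeddedResourceParser(HTMLParser):
    """Collect embedded-resource URIs that are visible in the markup itself."""

    def __init__(self):
        super().__init__()
        self.uris = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Tags that reference embedded resources through a src attribute.
        if tag in ("img", "script", "iframe") and attributes.get("src"):
            self.uris.append(attributes["src"])
        # Stylesheets and similar resources referenced through href.
        elif tag == "link" and attributes.get("href"):
            self.uris.append(attributes["href"])


def statically_visible_uris(html):
    """Return the URIs a non-executing parser can discover in the HTML."""
    parser = EmbeddedResourceParser()
    parser.feed(html)
    parser.close()
    return parser.uris


found = statically_visible_uris(PAGE)
print(found)                       # ['/static/logo.png']
print("/api/photo.json" in found)  # False
```

Because the script body is treated as opaque text, the URI it requests never appears among the discovered resources. A memento built from this list alone would either lack that resource or fetch it from the live web at replay time.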
