
Scholarly Context Adrift: Three out of Four URI References Lead to Changed Content



Abstract

Increasingly, scholarly articles contain URI references to “web at large” resources including project web sites, scholarly wikis, ontologies, online debates, presentations, blogs, and videos. Authors reference such resources to provide essential context for the research they report on. A reader who visits a web at large resource by following a URI reference in an article, some time after its publication, is led to believe that the resource’s content is representative of what the author originally referenced. However, due to the dynamic nature of the web, that may very well not be the case. We reuse a dataset from a previous study in which several authors of this paper were involved, and investigate to what extent the textual content of web at large resources referenced in a vast collection of Science, Technology, and Medicine (STM) articles published between 1997 and 2012 has remained stable since the publication of the referencing article. We do so in a two-step approach that relies on various well-established similarity measures to compare textual content. In a first step, we use 19 web archives to find snapshots of referenced web at large resources that have textual content that is representative of the state of the resource around the time of publication of the referencing paper. We find that representative snapshots exist for about 30% of all URI references. In a second step, we compare the textual content of representative snapshots with that of their live web counterparts. We find that for over 75% of references the content has drifted away from what it was when referenced. These results raise significant concerns regarding the long-term integrity of the web-based scholarly record and call for the deployment of techniques to combat these problems.
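
The second step described in the abstract, comparing the text of an archived snapshot with that of its live web counterpart, can be illustrated with a minimal sketch. The abstract does not name the similarity measures used, so Jaccard and cosine similarity are assumed here purely for illustration. The function names and the 0.9 threshold are hypothetical and are not taken from the study.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"\w+", text.lower())


def jaccard(a, b):
    """Jaccard similarity of the two texts' token sets."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    """Cosine similarity of the two texts' term-frequency vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    if not ca and not cb:
        return 1.0  # two empty texts are treated as identical
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def has_drifted(snapshot_text, live_text, threshold=0.9):
    """Flag drift when any measure falls below the threshold.

    The threshold is a hypothetical value chosen for this sketch.
    """
    scores = (jaccard(snapshot_text, live_text),
              cosine(snapshot_text, live_text))
    return min(scores) < threshold


# Hypothetical example texts: an archived snapshot and the current live page.
snapshot = "Project home page with dataset downloads and results"
live = "This domain is for sale"
print(has_drifted(snapshot, live))
```

Taking the minimum of several measures is a conservative design choice: a reference counts as stable only if every measure agrees that the two texts are close.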
